Preface


This text treats Lie groups, Lie algebras, and their representations. My pedagogical 
goals are twofold. First, I strive to develop the theory of Lie groups in an elementary 
fashion, with minimal prerequisites. In particular, in Part I, I develop the theory 
of (matrix) Lie groups and their Lie algebras using only linear algebra, without 
requiring any knowledge of manifold theory. Second, I strive to provide more 
motivation and intuition for the proofs, often using a figure, than in some of the 
classic texts on the subject. At the same time, I aim to be fully rigorous; an 
explanation or figure is a supplement to and not a replacement for a traditional 
proof. 

Although Lie theory is widely used in both mathematics and physics, there is 
often a wide gulf between the presentations of the subject in the two disciplines: 
Physics books get down to business quickly but are often imprecise in definitions 
and statements of theorems, whereas math books are more rigorous but often 
have a high barrier to entry. It is my hope that this book will be useful to both 
mathematicians and physicists. In particular, the matrix approach in Part I allows 
for definitions that are precise but comprehensible. Although I do not delve into the 
details of how Lie algebras are used in particle theory, I do include an extended 
discussion of the representations of SU(3), which has obvious applications to that 
field. (My recent book, Quantum Theory for Mathematicians [Hall], also aims 
to bridge a gap between the mathematics and physics literatures, and it contains 
some discussion of Lie-theoretic issues in quantum mechanics. The emphasis there, 
however, is on nonrelativistic quantum mechanics and not on quantum field theory.) 


Content of the Book Part I of the text covers the general theory of matrix Lie 
groups (i.e., closed subgroups of GL(n;C)) and their Lie algebras. Chapter 1 
introduces numerous examples of matrix Lie groups and examines their topological 
properties. After discussing the matrix exponential in Chapter 2, I turn to Lie 
algebras in Chapter 3, examining both abstract Lie algebras and Lie algebras 
associated with matrix Lie groups. Chapter 3 shows, among other things, that every 
matrix Lie group is an embedded submanifold of GL(n; C) and, thus, a Lie group. 
In Chapter 4, I consider elementary representation theory. Finally, Chapter 5 covers
the Baker-Campbell-Hausdorff formula and its consequences. I use this formula
(in place of the more traditional Frobenius theorem) to establish some of the deeper 
results about the relationship between Lie groups and Lie algebras. 

Part II of the text covers semisimple Lie algebras and their representations. I 
begin with an entire chapter on the representation theory of sl(3; C), that is, the 
complexification of the Lie algebra of the group SU(3). On the one hand, this 
example can be treated in an elementary way, simply by writing down a basis and 
calculating. On the other hand, this example allows the reader to see the machinery 
of roots, weights, and the Weyl group in action in a simple example, thus motivating 
the general version of these structures. For the general case, I use an unconventional 
definition of “semisimple,” namely that a complex Lie algebra is semisimple if it has 
trivial center and is the complexification of the Lie algebra of a compact group. I 
show that every such Lie algebra decomposes as a direct sum of simple algebras, and 
is thus semisimple in the conventional sense. Actually, every complex Lie algebra 
that is semisimple in the conventional sense has a “compact real form,” so that my 
definition of semisimple is equivalent to the standard one—but I do not prove this 
claim. As with the choice to consider matrix Lie groups in Part I, this (apparent) 
reduction in scope allows for a rapid development of the structure of semisimple Lie 
algebras. After developing the necessary properties of root systems in Chapter 8, I 
give the classification of representations in Chapter 9, as expressed in the theorem 
of the highest weight. Finally, Chapter 10 gives several additional properties of the 
representations, including complete reducibility, the Weyl character formula, and 
the Kostant multiplicity formula. 

Finally, Part III of the book presents the compact-group approach to representation
theory. Chapter 11 gives a proof of the torus theorem and establishes the
equivalence between the Lie-group and Lie-algebra definitions of the Weyl group. 
This chapter does, however, make use of some of the manifold theory that I avoided 
previously. The reader who is unfamiliar with manifold theory but willing to take 
a few things on faith should be able to proceed on to Chapter 12, where I develop 
the Weyl character formula and the theorem of the highest weight from the compact 
group point of view. In particular, Chapter 12 gives a self-contained construction 
of the representations, independent of the Lie-algebraic argument in Chapter 9. 
Lastly, in Chapter 13, I examine the fundamental group of a compact group from 
two different perspectives, one that treats the classical groups by induction on the 
dimension and one that is based on the torus theorem and uses the structure of the 
root system. This chapter shows, among other things, that for a simply connected 
compact group, the integral elements from the group point of view coincide with 
the integral elements from the Lie algebra point of view. This result shows that for 
simply connected compact groups, the theorem of the highest weight for the group 
is equivalent to the theorem of the highest weight for the Lie algebra. 

The first four chapters of the book cover elementary Lie theory and could be used 
for an undergraduate course. At the graduate level, one could pass quickly through 
Part I and then cover either Part II or Part III, depending on the interests of the
instructor. Although I have tried to explain and motivate the results in Parts II and III
of the book, using figures whenever possible, the material there is unquestionably
more challenging than in Part I. Nevertheless, I hope that the explicit working out of
the case of the Lie algebra sl(3; C) (or, equivalently, the group SU(3)) in Chapter 6
will give the reader a good sense of the flavor of the results in the subsequent 
chapters. 

In recent years, there have been several other books on Lie theory that use the 
matrix-group approach. Of these, the book of Rossmann [Ross] is most similar in 
style to my own. The first three chapters of [Ross] cover much of the same material 
as the first four chapters of this book. Although the organization of my book is, 
I believe, substantially different from that of other books on the subject, I make 
no claim to originality in any of the proofs. I myself learned most of the material 
here from the books of Bröcker and tom Dieck [BtD], Humphreys [Hum], and
Miller [Mill]. 


New Features of Second Edition This second edition of the book is substantially 
expanded from the first edition. Part I has been reorganized but covers mostly the 
same material as in the first edition. In Part II, however, at least half of the material is 
new. Chapter 8 now provides a complete derivation of all relevant properties of root 
systems. In Chapter 9, the construction of the finite-dimensional representations of 
a semisimple Lie algebra has been fleshed out, with the definition of the universal 
enveloping algebra, a proof of the Poincaré-Birkhoff-Witt theorem, and a proof of
the existence of Verma modules. Chapter 10 is mostly new and includes complete 
proofs of the Weyl character formula, the Weyl dimension formula, and the Kostant 
multiplicity formula. Part III, on the structure and representation theory of compact 
groups, is new in this edition. 

I have also included many more figures in the second edition. The black-and-white
images were created in Mathematica, while the color images in Sect. 8.9
were modeled in the Zometool system (www.zometool.com) and rendered in Scott 
Vorthmann’s vZome program (vzome.com). I thank Paul Hildebrandt for assisting 
me with construction of the Zometool models and Scott Vorthmann for going above 
and beyond in assisting me with use of vZome. 


Acknowledgments I am grateful for the input of many people on various versions of this text, 
which has improved it immensely. Contributors to the first printing of the first edition include Ed 
Bueler, Wesley Calvert, Tom Goebeler, Ruth Gornet, Keith Hubbard, Wicharn Lewkeeratiyutkul,


Part I
General Theory 


Chapter 1 
Matrix Lie Groups 


1.1 Definitions 


A Lie group is, roughly speaking, a continuous group, that is, a group described 
by several real parameters. In this book, we consider matrix Lie groups, which are 
Lie groups realized as groups of matrices. As an example, consider the set of all 
2 × 2 real matrices with determinant 1, customarily denoted SL(2; R). Since the
determinant of a product is the product of the determinants, this set forms a group
under the operation of matrix multiplication. If we think of the set of all 2 × 2
matrices, with entries a, b, c, d, as R^4, then SL(2; R) is the set of points in R^4 for
which the smooth function ad - bc has the value 1.

Suppose f is a smooth function on R^k and we consider the set E where f(x)
equals some constant value c. If, at each point x_0 in E, at least one of the partial
derivatives of f is nonzero, then the implicit function theorem tells us that we can
solve the equation f(x) = c near x_0 for one of the variables as a function of the
other k - 1 variables. Thus, E is a smooth “surface” (or embedded submanifold) in
R^k of dimension k - 1. In the case of SL(2; R) inside R^4, we note that the partial
derivatives of ad - bc with respect to a, b, c, and d are d, -c, -b, and a, respectively.
Thus, at each point where ad - bc = 1, at least one of these partial derivatives is
nonzero, and we conclude that SL(2; R) is a smooth surface of dimension 3. Thus,
SL(2; R) is a Lie group of dimension 3.
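
The gradient computation above can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`det2`, `grad_det2`, `numerical_grad`) and the sample point `p` are illustrative choices, not notation from the text. It compares the stated partial derivatives (d, -c, -b, a) with a finite-difference approximation at one point of SL(2; R).

```python
def det2(a, b, c, d):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def grad_det2(a, b, c, d):
    """Partial derivatives of ad - bc with respect to a, b, c, d."""
    return (d, -c, -b, a)

def numerical_grad(f, point, h=1e-6):
    """Central-difference approximation to the gradient of f at point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        dn = list(point)
        up[i] += h
        dn[i] -= h
        grad.append((f(*up) - f(*dn)) / (2 * h))
    return tuple(grad)

# A sample point of SL(2; R): 2*3 - 5*1 = 1.
p = (2.0, 5.0, 1.0, 3.0)
print(det2(*p))
print(grad_det2(*p))
print(numerical_grad(det2, p))
```

The gradient (d, -c, -b, a) vanishes only when a = b = c = d = 0, where ad - bc = 0, so it cannot vanish on the set where ad - bc = 1.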

For other groups of matrices (such as the ones we will encounter later in 
this section), one could use a similar approach. The analysis is, however, more 
complicated because most of the groups are defined by setting several different 
smooth functions equal to constants. One therefore has to check that these functions 
are “independent” in the sense of the implicit function theorem, which means that 
their gradient vectors have to be linearly independent at each point in the group. 

We will use an alternative approach that makes all such analysis unnecessary. 
We consider groups G of matrices that are closed in the sense of Definition 1.4. To 
each such G, we will associate in Chapter 3 a “Lie algebra” g, which is a real vector
space. A general result (Corollary 3.45) will then show that G is a smooth manifold
whose dimension is equal to the dimension of g as a vector space.

This chapter makes use of various standard results from linear algebra, which are 
summarized in Appendix A. 


Definition 1.1. The general linear group over the real numbers, denoted
GL(n; R), is the group of all n × n invertible matrices with real entries. The
general linear group over the complex numbers, denoted GL(n; C), is the group of
all n × n invertible matrices with complex entries.


Definition 1.2. Let M_n(C) denote the space of all n × n matrices with complex
entries.


We may identify M_n(C) with C^{n^2} and use the standard notion of convergence in
C^{n^2}. Explicitly, this means the following.


Definition 1.3. Let A_m be a sequence of complex matrices in M_n(C). We say that
A_m converges to a matrix A if each entry of A_m converges (as m → ∞) to the
corresponding entry of A (i.e., if (A_m)_{jk} converges to A_{jk} for all 1 ≤ j, k ≤ n).


We now consider subgroups of GL(n; C), that is, subsets G of GL(n; C) such
that the identity matrix is in G and such that for all A and B in G, the matrices AB
and A^{-1} are also in G.


Definition 1.4. A matrix Lie group is a subgroup G of GL(n; C) with the
following property: If A_m is any sequence of matrices in G, and A_m converges to
some matrix A, then either A is in G or A is not invertible.


The condition on G amounts to saying that G is a closed subset of GL(n; C).
(This does not necessarily mean that G is closed in M_n(C).) Thus, Definition 1.4 is
equivalent to saying that a matrix Lie group is a closed subgroup of GL(n; C).

The condition that G be a closed subgroup, as opposed to merely a subgroup,
should be regarded as a technicality, in that most of the interesting subgroups of
GL(n; C) have this property. Most of the matrix Lie groups G we will consider
have the stronger property that if A_m is any sequence of matrices in G, and A_m
converges to some matrix A, then A ∈ G (i.e., that G is closed in M_n(C)).

An example of a subgroup of GL(n; C) which is not closed (and hence is not a
matrix Lie group) is the set of all n × n invertible matrices with rational entries. This
set is, in fact, a subgroup of GL(n; C), but not a closed subgroup. That is, one can
(easily) have a sequence of invertible matrices with rational entries converging to an
invertible matrix with some irrational entries. (In fact, every real invertible matrix is
the limit of some sequence of invertible matrices with rational entries.)
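
To make the failure of closedness concrete, here is a small sketch under stated assumptions: the sequence `A(m)` below is a hypothetical example (not taken from the text) of invertible 2 × 2 matrices with rational entries whose entrywise limit is the invertible matrix with diagonal entries √2 and 1. The rational entries are decimal truncations of √2, computed exactly with integer arithmetic.

```python
from fractions import Fraction
from math import isqrt, sqrt

def rational_sqrt2(m):
    """Decimal truncation of sqrt(2) to m places, as an exact fraction."""
    return Fraction(isqrt(2 * 10 ** (2 * m)), 10 ** m)

def A(m):
    """Hypothetical sequence: an invertible diagonal matrix with rational entries."""
    return [[rational_sqrt2(m), Fraction(0)],
            [Fraction(0), Fraction(1)]]

for m in (1, 3, 6, 9):
    entry = A(m)[0][0]
    # Distance from the (1,1) entry to sqrt(2); the other entries are constant.
    print(m, entry, abs(float(entry) - sqrt(2)))
```

Every A(m) lies in the group of invertible rational matrices, but the limit has an irrational entry, so it lies in GL(2; C) without lying in that group.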

Another example of a group of matrices which is not a matrix Lie group is the
following subgroup of GL(2; C). Let a be an irrational real number and let

\[
G = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \,\middle|\, t \in \mathbb{R} \right\}. \tag{1.1}
\]

Clearly, G is a subgroup of GL(2; C). According to Exercise 10, the closure of G is
the group

\[
\overline{G} = \left\{ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\phi} \end{pmatrix} \,\middle|\, \theta, \phi \in \mathbb{R} \right\}.
\]

The group G inside $\overline{G}$ is known as an “irrational line in a torus”; see Figure 1.1.

[Fig. 1.1 A small portion of the group G inside $\overline{G}$ (left) and a larger portion (right)]
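
The following sketch illustrates numerically why G is not closed. It takes a = √2 (as a floating-point stand-in for an irrational number, which is an assumption of the example) and looks at the elements of G with t = 2πm, whose first diagonal entry is 1. Their second diagonal entries come arbitrarily close to -1, so the matrix with diagonal entries 1 and -1 is a limit of elements of G, although it is not itself in G. The function names are illustrative.

```python
import cmath
from math import pi, sqrt

a = sqrt(2)  # floating-point stand-in for an irrational number

def element(t):
    """Diagonal entries (e^{it}, e^{ita}) of the element of G with parameter t."""
    return (cmath.exp(1j * t), cmath.exp(1j * t * a))

def distance_to_target(m):
    """Entrywise distance from the element with t = 2*pi*m to diag(1, -1)."""
    z1, z2 = element(2 * pi * m)
    return max(abs(z1 - 1), abs(z2 + 1))

best = float("inf")
for m in range(1, 2000):
    d = distance_to_target(m)
    if d < best:
        best = d
        print(m, d)  # each printed line is a new closest approach
```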


1.2 Examples 


Mastering the subject of Lie groups involves not only learning the general theory 
but also familiarizing oneself with examples. In this section, we introduce some 
of the most important examples of (matrix) Lie groups. Among these are the 
classical groups, consisting of the general and special linear groups, the unitary 
and orthogonal groups, and the symplectic groups. The classical groups, and their 
associated Lie algebras, will be key examples in Parts II and III of the book. 


1.2.1 General and Special Linear Groups 


The general linear groups (over R or C) are themselves matrix Lie groups. Of
course, GL(n; C) is a subgroup of itself. Furthermore, if A_m is a sequence of
matrices in GL(n; C) and A_m converges to A, then by the definition of GL(n; C),
either A is in GL(n; C), or A is not invertible.
Moreover, GL(n; R) is a subgroup of GL(n; C), and if A_m ∈ GL(n; R) and A_m
converges to A, then the entries of A are real. Thus, either A is not invertible or
A ∈ GL(n; R).

The special linear group (over R or C) is the group of n × n invertible matrices
(with real or complex entries) having determinant one. Both of these are subgroups
of GL(n; C). Furthermore, if A_m is a sequence of matrices with determinant one and
A_m converges to A, then A also has determinant one, because the determinant is a
continuous function. Thus, SL(n; R) and SL(n; C) are matrix Lie groups.


1.2.2 Unitary and Orthogonal Groups 


An n × n complex matrix A is said to be unitary if the column vectors of A are
orthonormal, that is, if

\[
\sum_{l=1}^{n} \overline{A_{lj}}\, A_{lk} = \delta_{jk}. \tag{1.2}
\]

We may rewrite (1.2) as

\[
\sum_{l=1}^{n} (A^*)_{jl} A_{lk} = \delta_{jk}, \tag{1.3}
\]

where δ_{jk} is the Kronecker delta, equal to 1 if j = k and equal to zero if j ≠ k.
Here A* is the adjoint of A, defined by

\[
(A^*)_{jk} = \overline{A_{kj}}.
\]

Equation (1.3) says that A*A = I; thus, we see that A is unitary if and only if
A* = A^{-1}. In particular, every unitary matrix is invertible.

The adjoint operation on matrices satisfies (AB)* = B*A*. From this, we can
see that if A and B are unitary, then

\[
(AB)^*(AB) = B^* A^* A B = B^{-1} A^{-1} A B = I,
\]

showing that AB is also unitary. Furthermore, since (AA^{-1})* = I* = I, we see that
(A^{-1})* A* = I, which shows that (A^{-1})* = (A*)^{-1}. Thus, if A is unitary, we have

\[
(A^{-1})^* A^{-1} = (A^*)^{-1} A^{-1} = (A A^*)^{-1} = I,
\]

showing that A^{-1} is again unitary.

Thus, the collection of unitary matrices is a subgroup of GL(n; C). We call this
group the unitary group and we denote it by U(n). We may also define the special
unitary group SU(n), the subgroup of U(n) consisting of unitary matrices with
determinant 1. It is easy to check that both U(n) and SU(n) are closed subgroups of
GL(n; C) and thus matrix Lie groups.
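
The subgroup property can be spot-checked numerically. This is a minimal sketch in plain Python, assuming two hypothetical sample matrices `A` and `B` chosen for illustration; it tests the condition A*A = I up to floating-point tolerance for the samples, their product, and the adjoint (which is the inverse of a unitary matrix).

```python
import cmath

def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose: (A*)_{jk} = conj(A_{kj})."""
    return [[A[k][j].conjugate() for k in range(len(A))]
            for j in range(len(A[0]))]

def is_unitary(A, tol=1e-12):
    """Check A*A = I entry by entry, up to a floating-point tolerance."""
    P = matmul(adjoint(A), A)
    n = len(P)
    return all(abs(P[j][k] - (1 if j == k else 0)) < tol
               for j in range(n) for k in range(n))

s = 2 ** -0.5
A = [[s, s], [1j * s, -1j * s]]                     # columns are orthonormal
B = [[cmath.exp(0.3j), 0], [0, cmath.exp(-1.1j)]]   # diagonal phases

print(is_unitary(A), is_unitary(B))
print(is_unitary(matmul(A, B)))       # product of two unitary matrices
print(is_unitary(adjoint(A)))         # the adjoint, i.e., the inverse of A
print(is_unitary([[1, 1], [0, 1]]))   # a shear: invertible, tested for contrast
```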

Meanwhile, let ⟨·,·⟩ denote the standard inner product on C^n, given by

\[
\langle x, y \rangle = \sum_{j} \overline{x_j}\, y_j .
\]

(Note that we put the conjugate on the first factor in the inner product.) By
Proposition A.8, we have

\[
\langle x, Ay \rangle = \langle A^* x, y \rangle
\]

for all x, y ∈ C^n. Thus,

\[
\langle Ax, Ay \rangle = \langle A^* A x, y \rangle ,
\]

from which we can see that if A is unitary, then A preserves the inner product on
C^n, that is,

\[
\langle Ax, Ay \rangle = \langle x, y \rangle
\]

for all x and y. Conversely, if A preserves the inner product, we must have
⟨A*Ax, y⟩ = ⟨x, y⟩ for all x, y. It is not hard to see that this condition holds only
if A*A = I. Thus, an equivalent characterization of unitarity is that A is unitary if
and only if A preserves the standard inner product on C^n.

Finally, for any matrix A, we have that $\det A^* = \overline{\det A}$. Thus, if A is unitary, we
have

\[
\det(A^* A) = |\det A|^2 = \det I = 1.
\]

Hence, for all unitary matrices A, we have |det A| = 1.
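
As a quick numerical illustration of the two facts just derived, the sketch below uses one hypothetical unitary matrix `A` and two arbitrary sample vectors (all chosen for this example only). It checks that the inner product, with the conjugate on the first factor, is preserved and that the determinant has absolute value 1.

```python
def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

s = 2 ** -0.5
A = [[s, s], [1j * s, -1j * s]]   # sample unitary matrix (orthonormal columns)
x = [1 + 2j, -3j]
y = [0.5 - 1j, 2 + 0j]

lhs = inner(apply(A, x), apply(A, y))
rhs = inner(x, y)
print(lhs, rhs)
print(abs(det2(A)))
```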

In a similar fashion, an n × n real matrix A is said to be orthogonal if the
column vectors of A are orthonormal. As in the unitary case, we may give equivalent
versions of this condition. The only difference is that if A is real, A* is the same as
the transpose A^{tr} of A, given by

\[
(A^{\mathrm{tr}})_{jk} = A_{kj}.
\]

Thus, A is orthogonal if and only if A^{tr} = A^{-1}, and this holds if and only if A
preserves the inner product on R^n. Since det(A^{tr}) = det A, if A is orthogonal, we
have

\[
\det(A^{\mathrm{tr}} A) = \det(A)^2 = \det(I) = 1,
\]

so that det(A) = ±1. The collection of all orthogonal matrices forms a closed
subgroup of GL(n; C), which we call the orthogonal group and denote by O(n).
The set of n × n orthogonal matrices with determinant one is the special orthogonal
group, denoted SO(n). Geometrically, elements of SO(n) are rotations, while the
elements of O(n) are either rotations or combinations of rotations and reflections.

Consider now the bilinear form (·,·) on C^n defined by

\[
(x, y) = \sum_{j} x_j y_j . \tag{1.4}
\]

This form is not an inner product (Sect. A.6) because, for example, it is symmetric
rather than conjugate-symmetric. The set of all n × n complex matrices A which
preserve this form (i.e., such that (Ax, Ay) = (x, y) for all x, y ∈ C^n) is the complex
orthogonal group O(n; C), and it is a subgroup of GL(n; C). Since there are no
conjugates in the definition of the form (·,·), we have

\[
(x, Ay) = (A^{\mathrm{tr}} x, y)
\]

for all x, y ∈ C^n, where on the right-hand side of the above relation, we have
A^{tr} rather than A*. Repeating the arguments for the case of O(n), but now allowing
complex entries in our matrices, we find that an n × n complex matrix A is in O(n; C)
if and only if A^{tr}A = I, that O(n; C) is a matrix Lie group, and that det A = ±1
for all A in O(n; C). Note that O(n; C) is not the same as the unitary group U(n).
The group SO(n; C) is defined to be the set of all A in O(n; C) with det A = 1 and
it is also a matrix Lie group.
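
To see concretely that O(n; C) differs from U(n), consider the following sketch. The one-parameter family `complex_rotation(t)` is a hypothetical example chosen for illustration (it is not taken from the text): its transpose times itself is the identity, so it lies in O(2; C), while its adjoint times itself is not the identity for t ≠ 0.

```python
from math import cosh, sinh

def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose without conjugation."""
    return [list(col) for col in zip(*A)]

def adjoint(A):
    """Conjugate transpose."""
    return [[entry.conjugate() for entry in col] for col in zip(*A)]

def is_identity(M, tol=1e-9):
    n = len(M)
    return all(abs(M[j][k] - (1 if j == k else 0)) < tol
               for j in range(n) for k in range(n))

def complex_rotation(t):
    """Hypothetical sample element of O(2; C) with complex entries."""
    c, s = cosh(t), sinh(t)
    return [[c, 1j * s], [-1j * s, c]]

A = complex_rotation(1.5)
print(is_identity(matmul(transpose(A), A)))   # tests the condition for O(2; C)
print(is_identity(matmul(adjoint(A), A)))     # tests the condition for U(2)
```

Since cosh(t) grows without bound, this family also suggests that O(2; C) is not bounded, in contrast with U(2).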


1.2.3 Generalized Orthogonal and Lorentz Groups 


Let n and k be positive integers, and consider R^{n+k}. Define a symmetric bilinear
form [·,·]_{n,k} on R^{n+k} by the formula

\[
[x, y]_{n,k} = x_1 y_1 + \cdots + x_n y_n - x_{n+1} y_{n+1} - \cdots - x_{n+k} y_{n+k}. \tag{1.5}
\]

The set of (n + k) × (n + k) real matrices A which preserve this form (i.e., such that
[Ax, Ay]_{n,k} = [x, y]_{n,k} for all x, y ∈ R^{n+k}) is the generalized orthogonal group
O(n; k). It is a subgroup of GL(n + k; R) and a matrix Lie group (Exercise 1). Of
particular interest in physics is the Lorentz group O(3; 1). We also define SO(n; k)
to be the subgroup of O(n; k) consisting of elements with determinant 1.

If A is an (n + k) × (n + k) real matrix, let A^{(j)} denote the jth column vector
of A, that is,

\[
A^{(j)} = \begin{pmatrix} A_{1,j} \\ \vdots \\ A_{n+k,j} \end{pmatrix}.
\]

Note that A^{(j)} is equal to Ae_j, that is, the result of applying A to the jth standard
basis element e_j. Then A will belong to O(n; k) if and only if
[Ae_j, Ae_l]_{n,k} = [e_j, e_l]_{n,k} for all 1 ≤ j, l ≤ n + k. Explicitly, this means that
A ∈ O(n; k) if and only if the following conditions are satisfied:

\[
\begin{aligned}
{[A^{(j)}, A^{(l)}]}_{n,k} &= 0, & j &\neq l, \\
{[A^{(j)}, A^{(j)}]}_{n,k} &= 1, & 1 &\le j \le n, \\
{[A^{(j)}, A^{(j)}]}_{n,k} &= -1, & n+1 &\le j \le n+k.
\end{aligned} \tag{1.6}
\]

Let g denote the (n + k) × (n + k) diagonal matrix with ones in the first n diagonal
entries and minus ones in the last k diagonal entries:

\[
g = \begin{pmatrix}
1 & & & & & \\
& \ddots & & & & \\
& & 1 & & & \\
& & & -1 & & \\
& & & & \ddots & \\
& & & & & -1
\end{pmatrix}.
\]

Then A is in O(n; k) if and only if A^{tr} g A = g (Exercise 1). Taking the determinant
of this equation gives (det A)^2 det g = det g, or (det A)^2 = 1. Thus, for any A in
O(n; k), det A = ±1.
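
The matrix criterion can be tried out on the smallest case n = k = 1. The sketch below is illustrative only: `boost(s)` is a hypothetical sample element (a hyperbolic rotation), and the helper names are not from the text. It checks both the matrix identity with g and the preservation of the form on a pair of sample vectors.

```python
from math import cosh, sinh

def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def form(x, y, n, k):
    """[x, y]_{n,k}: plus signs on the first n coordinates, minus on the last k."""
    plus = sum(x[i] * y[i] for i in range(n))
    minus = sum(x[i] * y[i] for i in range(n, n + k))
    return plus - minus

def boost(s):
    """Hypothetical sample element of O(1; 1): a hyperbolic rotation."""
    return [[cosh(s), sinh(s)], [sinh(s), cosh(s)]]

n, k = 1, 1
g = [[1, 0], [0, -1]]
A = boost(0.7)
AtgA = matmul(matmul(transpose(A), g), A)
x, y = [1.0, 2.0], [-3.0, 0.5]

print(AtgA)  # should agree with g up to rounding
print(form(apply(A, x), apply(A, y), n, k), form(x, y, n, k))
```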


1.2.4 Symplectic Groups 


Consider the skew-symmetric bilinear form ω on R^{2n} defined as follows:

\[
\omega(x, y) = \sum_{j=1}^{n} (x_j y_{n+j} - x_{n+j} y_j). \tag{1.7}
\]

The set of all 2n × 2n matrices A which preserve ω (i.e., such that ω(Ax, Ay) =
ω(x, y) for all x, y ∈ R^{2n}) is the real symplectic group Sp(n; R), and it is a closed
subgroup of GL(2n; R). (Some authors refer to the group we have just defined as
Sp(2n; R) rather than Sp(n; R).) If Ω is the 2n × 2n matrix

\[
\Omega = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \tag{1.8}
\]

then

\[
\omega(x, y) = \langle x, \Omega y \rangle .
\]

From this, it is not hard to show that a 2n × 2n real matrix A belongs to Sp(n; R) if
and only if

\[
-\Omega A^{\mathrm{tr}} \Omega = A^{-1}. \tag{1.9}
\]

(See Exercise 2.) Taking the determinant of this identity gives det A = (det A)^{-1},
i.e., (det A)^2 = 1. This shows that det A = ±1, for all A ∈ Sp(n; R). In fact,
det A = 1 for all A ∈ Sp(n; R), although this is not obvious.
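
Here is a small exact check of the symplectic condition for n = 2, written as a sketch with integer arithmetic. The sample matrix `A` is a hypothetical block-diagonal example (blocks B and the inverse transpose of B) chosen for illustration, and the check uses the standard equivalent form A^{tr} Ω A = Ω of the condition that A preserves ω, together with the inverse formula -Ω A^{tr} Ω = A^{-1}.

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(size):
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]

def omega_matrix(n):
    """The 2n x 2n block matrix [[0, I], [-I, 0]]."""
    size = 2 * n
    W = [[0] * size for _ in range(size)]
    for j in range(n):
        W[j][n + j] = 1
        W[n + j][j] = -1
    return W

def is_symplectic(A):
    """Test A^tr Omega A = Omega, i.e., that A preserves the form omega."""
    W = omega_matrix(len(A) // 2)
    return matmul(matmul(transpose(A), W), A) == W

def candidate_inverse(A):
    """The matrix -Omega A^tr Omega."""
    W = omega_matrix(len(A) // 2)
    M = matmul(matmul(W, transpose(A)), W)
    return [[-entry for entry in row] for row in M]

# Hypothetical sample: blocks B = [[1, 1], [0, 1]] and (B^tr)^{-1} = [[1, 0], [-1, 1]].
A = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, -1, 1]]
D = [[2, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]   # invertible, used as a contrasting example

print(is_symplectic(A), is_symplectic(D))
print(matmul(candidate_inverse(A), A) == identity(4))
```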

One can define a bilinear form ω on C^{2n} by the same formula as in (1.7) (with
no conjugates). Over C, we have the relation

\[
\omega(z, w) = (z, \Omega w),
\]

where (·,·) is the complex bilinear form in (1.4). The set of 2n × 2n complex matrices
which preserve this form is the complex symplectic group Sp(n; C). A 2n × 2n
complex matrix A is in Sp(n; C) if and only if (1.9) holds. (Note: This condition
involves A^{tr}, not A*.) Again, we can easily show that each A ∈ Sp(n; C) satisfies
det A = ±1 and, again, it is actually the case that det A = 1. Finally, we have the
compact symplectic group Sp(n) defined as

\[
\mathrm{Sp}(n) = \mathrm{Sp}(n; \mathbb{C}) \cap \mathrm{U}(2n).
\]

That is to say, Sp(n) is the group of 2n × 2n matrices that preserve both the inner
product and the bilinear form ω. For more information about Sp(n), see Sect. 1.2.8.


1.2.5 The Euclidean and Poincaré Groups 


The Euclidean group E(n) is the group of all transformations of R^n that can be
expressed as a composition of a translation and an orthogonal linear transformation.
We write elements of E(n) as pairs {x, R} with x ∈ R^n and R ∈ O(n), and we let
{x, R} act on R^n by the formula

\[
\{x, R\}\, y = Ry + x.
\]

Since

\[
\{x_1, R_1\}\{x_2, R_2\}\, y = R_1(R_2 y + x_2) + x_1 = R_1 R_2 y + (x_1 + R_1 x_2),
\]

the product operation for E(n) is the following:

\[
\{x_1, R_1\}\{x_2, R_2\} = \{x_1 + R_1 x_2,\; R_1 R_2\}. \tag{1.10}
\]

The inverse of an element of E(n) is given by

\[
\{x, R\}^{-1} = \{-R^{-1} x,\; R^{-1}\}.
\]

The group E(n) is not a subgroup of GL(n; R), since translations are not linear
maps. However, E(n) is isomorphic to the (closed) subgroup of GL(n + 1; R)
consisting of matrices of the form

\[
\begin{pmatrix}
 & & & x_1 \\
 & R & & \vdots \\
 & & & x_n \\
0 & \cdots & 0 & 1
\end{pmatrix}, \tag{1.11}
\]

with R ∈ O(n). (The reader may easily verify that matrices of the form (1.11)
multiply according to the formula in (1.10).)
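
The verification suggested to the reader can be sketched in a few lines of Python for n = 2. The representation of an element {x, R} as a tuple `(x, R)` and the two sample elements are choices made for this example only. The code compares the product rule for pairs with multiplication of the corresponding 3 × 3 block matrices, and also checks the inverse formula (using the fact that the inverse of an orthogonal matrix is its transpose).

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(R, x):
    return [sum(R[i][j] * x[j] for j in range(len(x))) for i in range(len(R))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def compose(e1, e2):
    """Product rule for pairs: {x1, R1}{x2, R2} = {x1 + R1 x2, R1 R2}."""
    (x1, R1), (x2, R2) = e1, e2
    shifted = matvec(R1, x2)
    return ([a + b for a, b in zip(x1, shifted)], matmul(R1, R2))

def to_matrix(e):
    """Embed {x, R} as the (n+1) x (n+1) block matrix [[R, x], [0, 1]]."""
    x, R = e
    n = len(x)
    rows = [list(R[i]) + [x[i]] for i in range(n)]
    rows.append([0] * n + [1])
    return rows

def inverse(e):
    """{x, R}^{-1} = {-R^{-1} x, R^{-1}}, with R^{-1} = R^tr for orthogonal R."""
    x, R = e
    R_inv = transpose(R)
    return ([-c for c in matvec(R_inv, x)], R_inv)

e1 = ([1, 2], [[0, -1], [1, 0]])   # rotate by 90 degrees, then translate by (1, 2)
e2 = ([3, 4], [[1, 0], [0, -1]])   # reflect in the first axis, then translate by (3, 4)

print(to_matrix(compose(e1, e2)))
print(matmul(to_matrix(e1), to_matrix(e2)))
print(compose(e1, inverse(e1)))
```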
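We similarly define the Poincaré group P(n; 1) (also known as the inhomogeneous
Lorentz group) to be the group of all transformations of R^{n+1} of the form

\[
T = T_x A
\]

with x ∈ R^{n+1} and A ∈ O(n; 1), where T_x denotes translation by x. This group is
isomorphic to the group of (n + 2) × (n + 2) matrices of the form

\[
\begin{pmatrix}
 & & & x_1 \\
 & A & & \vdots \\
 & & & x_{n+1} \\
0 & \cdots & 0 & 1
\end{pmatrix}, \tag{1.12}
\]

with A ∈ O(n; 1).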


1.2.6 The Heisenberg Group 


The set of all 3 × 3 real matrices A of the form

\[
A = \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}, \tag{1.13}
\]

where a, b, and c are arbitrary real numbers, is the Heisenberg group H. It is easy to
check that the product of two matrices of the form (1.13) is again of that form, and,
clearly, the identity matrix is of the form (1.13). Furthermore, direct computation
shows that if A is as in (1.13), then

\[
A^{-1} = \begin{pmatrix} 1 & -a & ac - b \\ 0 & 1 & -c \\ 0 & 0 & 1 \end{pmatrix}.
\]

Thus, H is a subgroup of GL(3; R). The Heisenberg group is a model for the
Heisenberg-Weyl commutation relations in physics and also serves as an illuminating
example for the Baker-Campbell-Hausdorff formula (Sect. 5.2). See also
Exercise 8.
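
The "direct computation" can be reproduced exactly with integer entries. In this sketch, `heis(a, b, c)` is an illustrative helper that builds the upper triangular matrix with entries a, b, c above the diagonal, and the sample values are arbitrary.

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def heis(a, b, c):
    """Upper triangular 3 x 3 matrix with ones on the diagonal and a, b, c above it."""
    return [[1, a, b], [0, 1, c], [0, 0, 1]]

def heis_inverse(a, b, c):
    """Claimed inverse: entries -a, ac - b, -c above the diagonal."""
    return heis(-a, a * c - b, -c)

A = heis(2, 5, 3)
B = heis(-1, 4, 7)

print(matmul(A, B))                    # again of the same upper triangular form
print(matmul(B, A))                    # differs from AB: the group is not commutative
print(matmul(A, heis_inverse(2, 5, 3)))
```

Multiplying out shows that the parameters combine as (a, b, c)(a', b', c') = (a + a', b + b' + ac', c + c'), which is where the noncommutativity enters.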


1.2.7 The Groups R^*, C^*, S^1, R, and R^n


Several important groups which are not defined as groups of matrices can be
thought of as such. The group R^* of nonzero real numbers under multiplication is
isomorphic to GL(1; R). Similarly, the group C^* of nonzero complex numbers under
multiplication is isomorphic to GL(1; C) and the group S^1 of complex numbers with
absolute value one is isomorphic to U(1).

The group R under addition is isomorphic to GL(1; R)^+ (1 × 1 real matrices with
positive determinant) via the map x ↦ [e^x]. The group R^n (with vector addition) is
isomorphic to the group of diagonal real matrices with positive diagonal entries, via
the map

\[
(x_1, \ldots, x_n) \mapsto \begin{pmatrix} e^{x_1} & & \\ & \ddots & \\ & & e^{x_n} \end{pmatrix}.
\]


1.2.8 The Compact Symplectic Group 


Of the groups introduced in the preceding subsections, the compact symplectic
group Sp(n) := Sp(n; C) ∩ U(2n) is the most mysterious. In this section, we
attempt to understand the structure of Sp(n) and to show that it can be understood
as being the “unitary group over the quaternions.”

Since the definition of Sp(n) involves unitarity, it is convenient to express the
bilinear form ω on C^{2n} in terms of the inner product ⟨·,·⟩, rather than in terms of the
bilinear form (·,·), as we did in Sect. 1.2.4. To this end, define a conjugate-linear
map J : C^{2n} → C^{2n} by

\[
J(\alpha, \beta) = (-\bar{\beta}, \bar{\alpha}),
\]

where α and β are in C^n and (α, β) is in C^{2n}. We can easily check that for all
z, w ∈ C^{2n}, we have

\[
\omega(z, w) = \langle Jz, w \rangle .
\]

Recall that we take our inner product to be conjugate linear in the first factor; since
J is also conjugate linear, ⟨Jz, w⟩ is actually linear in z. We may easily check that

\[
\langle Jz, w \rangle = -\overline{\langle z, Jw \rangle} = -\langle Jw, z \rangle
\]

for all z, w ∈ C^{2n} and that

\[
J^2 = -I.
\]


Proposition 1.5. If U belongs to U(2n), then U belongs to Sp(n) if and only if U
commutes with J.


Proof. Fix some U in U(2n). Then for z and w in C^{2n}, we have, on the one hand,

\[
\omega(Uz, Uw) = \langle JUz, Uw \rangle = \langle U^* JUz, w \rangle = \langle U^{-1} JUz, w \rangle ,
\]

and, on the other hand,

\[
\omega(z, w) = \langle Jz, w \rangle .
\]

From this it is easy to check that U preserves ω if and only if

\[
U^{-1} J U = J,
\]

which is equivalent to JU = UJ. □
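
Proposition 1.5 can be illustrated in the smallest case n = 1, where C^{2n} = C^2. The sketch below rests on assumptions made for this example: J is taken to be the map (α, β) ↦ (-conj(β), conj(α)), ω is the form z_1 w_2 - z_2 w_1, `U_good` is a sample unitary matrix with determinant 1, and `U_bad` is i times the identity (unitary, with determinant -1). For each matrix the code tests, on sample vectors, whether it commutes with J and whether it preserves ω.

```python
def J(z):
    """Conjugate-linear map J(alpha, beta) = (-conj(beta), conj(alpha)) on C^2."""
    alpha, beta = z
    return [-beta.conjugate(), alpha.conjugate()]

def apply(U, z):
    """Matrix-vector product for a 2 x 2 matrix."""
    return [U[0][0] * z[0] + U[0][1] * z[1],
            U[1][0] * z[0] + U[1][1] * z[1]]

def omega(z, w):
    """The skew-symmetric form for n = 1: z_1 w_2 - z_2 w_1."""
    return z[0] * w[1] - z[1] * w[0]

def commutes_with_J(U, z, tol=1e-12):
    """Compare J(Uz) with U(Jz) on the sample vector z."""
    left, right = J(apply(U, z)), apply(U, J(z))
    return all(abs(s - t) < tol for s, t in zip(left, right))

def preserves_omega(U, z, w, tol=1e-12):
    return abs(omega(apply(U, z), apply(U, w)) - omega(z, w)) < tol

p, q = 0.6 + 0j, 0.8j                                   # |p|^2 + |q|^2 = 1
U_good = [[p, -q.conjugate()], [q, p.conjugate()]]      # unitary with determinant 1
U_bad = [[1j, 0j], [0j, 1j]]                            # unitary with determinant -1

z = [1 + 2j, -0.5 + 1j]
w = [2 - 1j, 3j]

print(commutes_with_J(U_good, z), preserves_omega(U_good, z, w))
print(commutes_with_J(U_bad, z), preserves_omega(U_bad, z, w))
```

A check on finitely many vectors is only a spot check, of course; it does not replace the proof.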


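The preceding result can be used to give a different perspective on the definition
of Sp(n), as follows. The quaternion algebra ℍ is the four-dimensional associative
algebra over R spanned by elements 1 (the identity), i, j, and k satisfying

\[
\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1
\]

and

\[
\mathbf{i}\mathbf{j} = \mathbf{k} = -\mathbf{j}\mathbf{i}, \qquad
\mathbf{j}\mathbf{k} = \mathbf{i} = -\mathbf{k}\mathbf{j}, \qquad
\mathbf{k}\mathbf{i} = \mathbf{j} = -\mathbf{i}\mathbf{k}.
\]

We may realize the quaternion algebra inside M_2(C) by identifying 1 with
the identity matrix and setting

\[
\mathbf{i} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad
\mathbf{j} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
\mathbf{k} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.
\]

The algebra ℍ is then the space of real linear combinations of I, i, j, and k.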
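
A matrix realization of the quaternions can be checked mechanically. The sketch below assumes the particular realization in which i is the diagonal matrix with entries i and -i, j is the real matrix with rows (0, 1) and (-1, 0), and k is the matrix with rows (0, i) and (i, 0); the variable names `qi`, `qj`, `qk` are illustrative. All arithmetic is exact, since the entries are Gaussian integers.

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def neg(A):
    """Entrywise negative of a matrix."""
    return [[-entry for entry in row] for row in A]

I2 = [[1, 0], [0, 1]]
qi = [[1j, 0], [0, -1j]]   # assumed matrix for the quaternion i
qj = [[0, 1], [-1, 0]]     # assumed matrix for the quaternion j
qk = [[0, 1j], [1j, 0]]    # assumed matrix for the quaternion k

# Squares should all equal -I.
print(matmul(qi, qi) == neg(I2), matmul(qj, qj) == neg(I2), matmul(qk, qk) == neg(I2))
# Cyclic products and one anticommutation check.
print(matmul(qi, qj) == qk, matmul(qj, qk) == qi, matmul(qk, qi) == qj)
print(matmul(qj, qi) == neg(qk))
```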
Now, since J is conjugate linear, we have

\[
J(iz) = -iJz
\]

for all z ∈ C^{2n}; that is, iJ = -Ji. Thus, if we define K to be iJ, we have

\[
K^2 = iJiJ = -J(i)^2 J = J^2 = -I,
\]

and one can easily check that iI, J, and K satisfy the same commutation relations
as i, j, and k. We can therefore make C^{2n} into a “vector space” over the
noncommutative algebra ℍ by setting

\[
\mathbf{i} \cdot z = iz, \qquad \mathbf{j} \cdot z = Jz, \qquad \mathbf{k} \cdot z = iJz.
\]

Now, if U belongs to Sp(n), then U commutes with multiplication by i and with
J (Proposition 1.5) and thus, also, with K := iJ. Thus, U is actually “quaternion
linear.” A 2n × 2n matrix U therefore belongs to Sp(n) if and only if U is quaternion
linear and preserves the norm. Thus, we may think of Sp(n) as the “unitary group
over the quaternions.” The compact symplectic group then fits naturally with the
orthogonal groups (norm-preserving maps over R) and the unitary groups
(norm-preserving maps over C).

Every U ∈ U(2n) has an orthonormal basis of eigenvectors, with eigenvalues
having absolute value 1. We now determine the additional properties the eigenvectors
and eigenvalues must satisfy in order for U to be in Sp(n) = U(2n) ∩ Sp(n; C).


Theorem 1.6. If U ∈ Sp(n), then there exists an orthonormal basis u_1, ..., u_n,
v_1, ..., v_n for C^{2n} such that the following properties hold: First, Ju_j = v_j; second,
for some real numbers θ_1, ..., θ_n, we have

\[
U u_j = e^{i\theta_j} u_j, \qquad U v_j = e^{-i\theta_j} v_j;
\]

and third,

\[
\omega(u_j, u_k) = \omega(v_j, v_k) = 0, \qquad \omega(u_j, v_k) = \delta_{jk}.
\]

Conversely, if there exists an orthonormal basis with these properties, U belongs to
Sp(n).


Lemma 1.7. Suppose V is a complex subspace of C^{2n} that is invariant under the
conjugate-linear map J. Then the orthogonal complement V^⊥ of V (with respect
to the inner product ⟨·,·⟩) is also invariant under J. Furthermore, V and V^⊥ are
orthogonal with respect to ω; that is,

\[
\omega(z, w) = 0
\]

for all z ∈ V and w ∈ V^⊥.

Proof. If w ∈ V^⊥, then for all z ∈ V, we have

\[
\langle Jw, z \rangle = -\langle Jz, w \rangle = 0,
\]

because Jz is again in V. Thus, V^⊥ is invariant under J. Then if z ∈ V and w ∈ V^⊥,
we have

\[
\omega(z, w) = \langle Jz, w \rangle = 0,
\]

because Jz is again in V. □


Proof of Theorem 1.6. Consider U in Sp(n; C) ∩ U(2n), choose an eigenvector for
U, normalized to be a unit vector, and call it u_1. Since U preserves the norms of
vectors, the eigenvalue λ_1 for u_1 must be of the form e^{iθ_1} for some θ_1 ∈ R. If we set
v_1 = Ju_1, then since J is conjugate linear and commutes with U (Proposition 1.5),
we have

\[
U v_1 = J(U u_1) = J(e^{i\theta_1} u_1) = e^{-i\theta_1} v_1.
\]

That is to say, v_1 is an eigenvector for U with eigenvalue e^{-iθ_1}. Furthermore,

\[
\langle v_1, u_1 \rangle = \langle J u_1, u_1 \rangle = \omega(u_1, u_1) = 0,
\]

since ω is a skew-symmetric form. On the other hand,

\[
\omega(u_1, v_1) = \langle J u_1, v_1 \rangle = \langle J u_1, J u_1 \rangle = 1,
\]

since J preserves the magnitude of vectors.

Now, since J^2 = -I, we can easily check that the span V of u_1 and v_1 = Ju_1
is invariant under J. Thus, by Lemma 1.7, V^⊥ is also invariant under J and is
ω-orthogonal to V. Meanwhile, V is invariant under both U and U* = U^{-1}.
Thus, by Proposition A.10, V^⊥ is invariant under both U** = U and U*. Since
U preserves V^⊥, the restriction of U to V^⊥ will have an eigenvector, which we
can normalize to be a unit vector and call u_2. If we let v_2 = Ju_2, then we
have all the same properties for u_2 and v_2 as for u_1 and v_1. Furthermore, u_2 and
v_2 are orthogonal (with respect to both ⟨·,·⟩ and ω(·,·)) to u_1 and v_1. We can
then proceed on in a similar fashion to obtain the full set of vectors u_1, ..., u_n
and v_1, ..., v_n. (If u_1, ..., u_k and v_1, ..., v_k have been chosen, we take u_{k+1} and
v_{k+1} := Ju_{k+1} in the orthogonal complement of the span of u_1, ..., u_k and
v_1, ..., v_k.)

The other direction of the theorem is left to the reader (Exercise 6). □


1.3 Topological Properties 


In this section, we investigate three important topological properties of matrix Lie 
groups, each of which is satisfied by some groups but not others. 


1.3.1 Compactness 


The first property we consider is compactness. 


Definition 1.8. A matrix Lie group G C GL(n; C) is said to be compact if it is 
compact in the usual topological sense as a subset of M,(C) = R2’, 


In light of the Heine-Borel theorem (Theorem 2.41 in [Rud1]), a matrix Lie
group $G$ is compact if and only if it is closed (as a subset of $M_n(\mathbb{C})$, not just as a
subset of $\mathrm{GL}(n;\mathbb{C})$) and bounded. Explicitly, this means that $G$ is compact if and
only if (1) whenever $A_m \in G$ and $A_m \to A$, then $A$ is in $G$, and (2) there exists a
constant $C$ such that for all $A \in G$, we have $|A_{jk}| \leq C$ for all $1 \leq j,k \leq n$.

The following groups are compact: O(n) and SO(n), U(n) and SU(n), and
Sp(n). Each of these groups is easily seen to be closed in $M_n(\mathbb{C})$ and each satisfies
the bound $|A_{jk}| \leq 1$, since in each case, the columns of $A \in G$ are required to
be unit vectors. Most of the other groups we have considered are noncompact. The
special linear group $\mathrm{SL}(n;\mathbb{R})$, for example, is unbounded (except in the trivial case
$n = 1$), because for all $m \neq 0$, the matrix

\[ A_m = \begin{pmatrix} m & & & & \\ & 1/m & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \]

has determinant one.


1.3.2 Connectedness


The second property we consider is connectedness.


Definition 1.9. A matrix Lie group $G$ is said to be connected if for all $A$ and $B$ in
$G$, there exists a continuous path $A(t)$, $a \leq t \leq b$, lying in $G$ with $A(a) = A$ and
$A(b) = B$. For any matrix Lie group $G$, the identity component of $G$, denoted $G_0$,
is the set of $A \in G$ for which there exists a continuous path $A(t)$, $a \leq t \leq b$, lying
in $G$ with $A(a) = I$ and $A(b) = A$.


The property we have called “connected” in Definition 1.9 is what is called path
connected in topology, which is not (in general) the same as connected. However,
we will eventually prove that a matrix Lie group is connected if and only if it is
path connected. Thus, in a slight abuse of terminology, we shall continue to refer to
the above property as connectedness. (See the remarks following Corollary 3.45.)

To show that a matrix Lie group $G$ is connected, it suffices to show that each
$A \in G$ can be connected to the identity by a continuous path lying in $G$.


Proposition 1.10. If $G$ is a matrix Lie group, the identity component $G_0$ of $G$ is a
normal subgroup of $G$.


We will see in Sect. 3.7 that $G_0$ is closed and hence a matrix Lie group.


Proof. If $A$ and $B$ are any two elements of $G_0$, then there are continuous paths $A(t)$
and $B(t)$ connecting $I$ to $A$ and to $B$ in $G$. Then the path $A(t)B(t)$ is a continuous
path connecting $I$ to $AB$ in $G$, and $(A(t))^{-1}$ is a continuous path connecting $I$ to
$A^{-1}$ in $G$. Thus, both $AB$ and $A^{-1}$ belong to $G_0$, showing that $G_0$ is a subgroup of
$G$. Now suppose $A$ is in $G_0$ and $B$ is any element of $G$. Then there is a continuous
path $A(t)$ connecting $I$ to $A$ in $G$, and the path $BA(t)B^{-1}$ connects $I$ to $BAB^{-1}$ in
$G$. Thus, $BAB^{-1} \in G_0$, showing that $G_0$ is normal. □


Note that because matrix multiplication and matrix inversion are continuous on
$\mathrm{GL}(n;\mathbb{C})$, it follows that if $A(t)$ and $B(t)$ are continuous, then so are $A(t)B(t)$
and $A(t)^{-1}$. The continuity of the matrix product is obvious. The continuity of
the inverse follows from the formula for the inverse in terms of cofactors; this
formula is continuous as long as we remain in the set of invertible matrices, where
the determinant in the denominator is nonzero.


Proposition 1.11. The group $\mathrm{GL}(n;\mathbb{C})$ is connected for all $n \geq 1$.


Proof. We make use of the result that every matrix is similar to an upper triangular
matrix (Theorem A.4). That is to say, we can express any $A \in M_n(\mathbb{C})$ in the form
$A = CBC^{-1}$, where

\[ B = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}. \]

If $A$ is invertible, each $\lambda_j$ must be nonzero. Let $B(t)$ be obtained by multiplying the
part of $B$ above the diagonal by $(1 - t)$, for $0 \leq t \leq 1$, and let $A(t) = CB(t)C^{-1}$.
Then $A(t)$ is a continuous path lying in $\mathrm{GL}(n;\mathbb{C})$ which starts at $A$ and ends at
$CDC^{-1}$, where $D$ is the diagonal matrix with diagonal entries $\lambda_1,\ldots,\lambda_n$. We can
now define paths $\lambda_j(t)$ connecting $\lambda_j$ to 1 in $\mathbb{C}^*$ as $t$ goes from 1 to 2, and we can
define $A(t)$ on the interval $1 \leq t \leq 2$ by

\[ A(t) = C \begin{pmatrix} \lambda_1(t) & & 0 \\ & \ddots & \\ 0 & & \lambda_n(t) \end{pmatrix} C^{-1}. \]

Then $A(t)$, $0 \leq t \leq 2$, is a continuous path in $\mathrm{GL}(n;\mathbb{C})$ connecting $A$ to $I$. □


An alternative proof of this result is given in Exercise 12. 

Proposition 1.12. The group $\mathrm{SL}(n;\mathbb{C})$ is connected for all $n \geq 1$.


Proof. The proof is almost the same as for $\mathrm{GL}(n;\mathbb{C})$, except that we must make
sure our path connecting $A \in \mathrm{SL}(n;\mathbb{C})$ to $I$ lies entirely in $\mathrm{SL}(n;\mathbb{C})$. We can
ensure this by choosing $\lambda_n(t)$, in the second part of the preceding proof, to be equal
to $(\lambda_1(t)\cdots\lambda_{n-1}(t))^{-1}$. □

Proposition 1.13. The groups U(n) and SU(n) are connected, for all $n \geq 1$.


Proof. By Theorem A.3, every unitary matrix has an orthonormal basis of
eigenvectors, with eigenvalues having absolute value 1. Thus, each $U \in \mathrm{U}(n)$ can be
written as $U_1DU_1^{-1}$, where $U_1 \in \mathrm{U}(n)$ and $D$ is diagonal with diagonal entries
$e^{i\theta_1},\ldots,e^{i\theta_n}$. We may then define

\[ U(t) = U_1 \begin{pmatrix} e^{i(1-t)\theta_1} & & 0 \\ & \ddots & \\ 0 & & e^{i(1-t)\theta_n} \end{pmatrix} U_1^{-1}, \quad 0 \leq t \leq 1. \]

It is easy to see that $U(t)$ is in U(n) for all $t$, and $U(t)$ connects $U$ to $I$. A slight
modification of this argument, as in the proof of Proposition 1.12, shows that SU(n)
is connected. □
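
The path used in this proof can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument; the particular $U_1$ and angles $\theta_j$ are hypothetical values chosen only for illustration, and the helper names (`matmul`, `adjoint`, `path`, `dist`) are mine.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def path(U1, thetas, t):
    """U(t) = U1 diag(exp(i(1-t)theta_j)) U1^{-1}, using U1^{-1} = U1^* for unitary U1."""
    n = len(thetas)
    D = [[cmath.exp(1j * (1 - t) * thetas[j]) if j == k else 0 for k in range(n)]
         for j in range(n)]
    return matmul(matmul(U1, D), adjoint(U1))

def dist(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
U1 = [[s, s], [s, -s]]       # a unitary change of basis (hypothetical example)
thetas = [0.9, -2.0]         # eigenvalue angles of U (hypothetical example)
I2 = [[1, 0], [0, 1]]

U = path(U1, thetas, 0.0)    # the starting point U = U(0)
mid = path(U1, thetas, 0.5)  # a point in the middle of the path

print(dist(matmul(mid, adjoint(mid)), I2))  # tiny: U(1/2) is unitary
print(dist(path(U1, thetas, 1.0), I2))      # tiny: U(1) = I
```

For SU(n) one would, in addition, choose the angles so that they sum to zero, in the spirit of the proof of Proposition 1.12.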


The group SO(n) is also connected; see Exercise 13. 


1.3.3 Simple Connectedness 


The last topological property we consider is simple connectedness.


Definition 1.14. A matrix Lie group $G$ is said to be simply connected if it is
connected and, in addition, every loop in $G$ can be shrunk continuously to a point
in $G$.

More precisely, assume that $G$ is connected. Then $G$ is simply connected if for
every continuous path $A(t)$, $0 \leq t \leq 1$, lying in $G$ and with $A(0) = A(1)$, there
exists a continuous function $A(s,t)$, $0 \leq s,t \leq 1$, taking values in $G$ and having
the following properties: (1) $A(s,0) = A(s,1)$ for all $s$, (2) $A(0,t) = A(t)$, and (3)
$A(1,t) = A(1,0)$ for all $t$.


One should think of A(t) as a loop and A(s, t) as a family of loops, parameterized 
by the variable s which shrinks A(t) to a point. Condition 1 says that for each 
value of the parameter s, we have a loop; Condition 2 says that when s = 0 the 
loop is the specified loop A(t); and Condition 3 says that when s = 1 our loop 
is a point. The condition of simple connectedness is important because for simply 
connected groups, there is a particularly close relationship between the group and 
the Lie algebra. (See Sect. 5.7.) 


Proposition 1.15. The group SU(2) is simply connected.


Proof. Exercise 5 shows that SU(2) may be thought of (topologically) as the
three-dimensional sphere $S^3$ sitting inside $\mathbb{R}^4$. It is well known that $S^3$ is simply
connected; see, for example, Proposition 1.14 in [Hat]. □


If a matrix Lie group $G$ is not simply connected, the degree to which it fails to
be simply connected is encoded in the fundamental group of $G$. (See Sect. 13.1.)
Sections 13.2 and 13.3 analyze several additional examples. It is shown there, for
example, that SU(n) is simply connected for all $n$.


1.3.4 The Topology of SO(3) 


We conclude this section with an analysis of the topological structure of the group 
SO(3). We begin by describing real projective spaces. 


Definition 1.16. The real projective space of dimension $n$, denoted $\mathbb{RP}^n$, is the
set of lines through the origin in $\mathbb{R}^{n+1}$. Since each line through the origin intersects
the unit sphere exactly twice, we may think of $\mathbb{RP}^n$ as the unit sphere $S^n$ with
“antipodal” points $u$ and $-u$ identified.


Using the second description, we think of points in $\mathbb{RP}^n$ as pairs $\{u,-u\}$, with
$u \in S^n$. There is a natural map $\pi : S^n \to \mathbb{RP}^n$, given by

\[ \pi(u) = \{u, -u\}. \]

We may define a distance function on $\mathbb{RP}^n$ by defining

\[ d(\{u,-u\},\{v,-v\}) = \min(d(u,v), d(u,-v), d(-u,v), d(-u,-v)) = \min(d(u,v), d(u,-v)). \]

(The second equality holds because $d(x,y) = d(-x,-y)$.) With this metric, $\mathbb{RP}^n$
is locally isometric to $S^n$, since if $u$ and $v$ are nearby points in $S^n$, we have
$d(\{u,-u\},\{v,-v\}) = d(u,v)$.
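
As a small illustration of this metric, here is a sketch in plain Python. It assumes that $d$ on $S^n$ is the Euclidean distance inherited from $\mathbb{R}^{n+1}$ (the text does not fix a choice); the function names `sphere_dist` and `rp_dist` are hypothetical.

```python
import math

def sphere_dist(u, v):
    """Euclidean distance between two points of the unit sphere (an assumed choice of d)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rp_dist(u, v):
    """Distance between the classes {u, -u} and {v, -v} in real projective space."""
    neg_v = [-b for b in v]
    return min(sphere_dist(u, v), sphere_dist(u, neg_v))

u = (0.0, 0.0, 1.0)
v = (0.0, math.sin(0.1), math.cos(0.1))  # a point near u
w = (0.0, 0.0, -1.0)                     # the antipode of u

print(rp_dist(u, v), sphere_dist(u, v))  # equal: nearby points keep their distance
print(rp_dist(u, w))                     # zero: u and -u are the same point of RP^2
```

The first line of output reflects the local isometry with $S^n$, and the second reflects the identification of antipodal points.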

It is known that $\mathbb{RP}^n$ is not simply connected. (See, for example, Example 1.43
in [Hat].) Indeed, suppose $u$ is any unit vector in $\mathbb{R}^{n+1}$ and $B(t)$ is any path in $S^n$
connecting $u$ to $-u$. Then

\[ A(t) := \pi(B(t)) \]

is a loop in $\mathbb{RP}^n$, and this loop cannot be shrunk continuously to a point in $\mathbb{RP}^n$.
To prove this claim, suppose that a map $A(s,t)$ as in Definition 1.14 exists. Then $A(s,t)$
can be “lifted” to a continuous map $B(s,t)$ into $S^n$ such that $B(0,t) = B(t)$ and
such that $A(s,t) = \pi(B(s,t))$. (See Proposition 1.30 in [Hat].) Since $A(s,0) =
A(s,1)$ for all $s$, we must have $B(s,0) = \pm B(s,1)$. But by construction, $B(0,0) =
-B(0,1)$. In order for $B(s,t)$ to be continuous in $s$, we must then have $B(s,0) =
-B(s,1)$ for all $s$. It follows that $B(1,t)$ is a nonconstant path in $S^n$. It is then easily
verified that $A(1,t) = \pi(B(1,t))$ cannot be constant, contradicting our assumption
about $A(s,t)$.

Let $D^n$ denote the closed upper hemisphere in $S^n$, that is, the set of points $u \in S^n$
with $u_{n+1} \geq 0$. Then $\pi$ maps $D^n$ onto $\mathbb{RP}^n$, since at least one of $u$ and $-u$ is in
$D^n$. The restriction of $\pi$ to $D^n$ is injective except on the equator, that is, the set of
$u \in S^n$ with $u_{n+1} = 0$. If $u$ is in the equator, then $-u$ is also in the equator, and
$\pi(-u) = \pi(u)$. Thus, we may also think of $\mathbb{RP}^n$ as the upper hemisphere $D^n$, with
antipodal points on the equator identified (Figure 1.2).

We may now make one last identification using the projection $P$ of $\mathbb{R}^{n+1}$ onto
$\mathbb{R}^n$. (That is to say, $P$ is the map sending $(x_1,\ldots,x_n,x_{n+1})$ to $(x_1,\ldots,x_n)$.) The
restriction of $P$ to $D^n$ is a continuous bijection between $D^n$ and the closed unit ball
$B^n$ in $\mathbb{R}^n$, with the equator in $D^n$ mapping to the boundary of the ball. Thus, our
last model of $\mathbb{RP}^n$ is the closed unit ball $B^n \subset \mathbb{R}^n$, with antipodal points on the
boundary of $B^n$ identified.

Fig. 1.2 The space $\mathbb{RP}^n$ is the upper hemisphere with antipodal points on the equator identified.
The indicated path from $u$ to $-u$ corresponds to a loop in $\mathbb{RP}^n$ that cannot be shrunk to a point

We now turn to a topological analysis of SO(3).


Proposition 1.17. There is a continuous bijection between SO(3) and $\mathbb{RP}^3$.


Since $\mathbb{RP}^3$ is not simply connected, it follows that SO(3) is not simply
connected, either.


Proof. If $v$ is a unit vector in $\mathbb{R}^3$, let $R_{v,\theta}$ be the element of SO(3) consisting of
a “right-handed” rotation by angle $\theta$ in the plane orthogonal to $v$. That is to say,
let $v^\perp$ denote the plane orthogonal to $v$ and choose an orthonormal basis $(u_1,u_2)$
for $v^\perp$ in such a way that the linear map taking the orthonormal basis $(u_1,u_2,v)$
to the standard basis $(e_1,e_2,e_3)$ has positive determinant. We use the basis $(u_1,u_2)$
to identify $v^\perp$ with $\mathbb{R}^2$, and the rotation is then in the counterclockwise direction
in $\mathbb{R}^2$. It is easily seen that $R_{-v,\theta}$ is the same as $R_{v,-\theta}$. It is also not hard to show
(Exercise 14) that every element of SO(3) can be expressed as $R_{v,\theta}$, for some $v$ and
$\theta$ with $-\pi \leq \theta \leq \pi$. Furthermore, we can arrange that $0 \leq \theta \leq \pi$ by replacing $v$
with $-v$ if necessary.

If $R = I$, then $R = R_{v,0}$ for any unit vector $v$. If $R$ is a rotation by angle $\pi$ about
some axis $v$, then $R$ can be expressed both as $R_{v,\pi}$ and as $R_{-v,\pi}$. It is not hard to see
that if $R \neq I$ and $R$ is not a rotation by angle $\pi$, then $R$ has a unique representation
as $R_{v,\theta}$ with $0 < \theta < \pi$.

Now let $B^3$ denote the closed ball of radius $\pi$ in $\mathbb{R}^3$ and consider the map $\Phi :
B^3 \to \mathrm{SO}(3)$ given by

\[ \Phi(u) = R_{\hat{u},\|u\|}, \quad u \neq 0, \]
\[ \Phi(0) = I. \]

Here, $\hat{u} = u/\|u\|$ is the unit vector in the $u$-direction. The map $\Phi$ is continuous, even
at $0$, since $R_{v,\theta}$ approaches the identity as $\theta$ approaches zero, regardless of how $v$
is behaving. The discussion in the preceding paragraph shows that $\Phi$ maps $B^3$ onto
SO(3). The map $\Phi$ is injective except that “antipodal” points on the boundary of $B^3$
have the same image: $R_{v,\pi} = R_{-v,\pi}$. Thus, $\Phi$ descends to a continuous, injective
map of $\mathbb{RP}^3$ onto SO(3). Since both $\mathbb{RP}^3$ and SO(3) are compact, Theorem 4.17
in [Rud1] tells us that the inverse map is also continuous, meaning that SO(3) is
homeomorphic to $\mathbb{RP}^3$. □
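
The identities $R_{-v,\theta} = R_{v,-\theta}$ and $R_{v,\pi} = R_{-v,\pi}$ used in the proof can be checked numerically. The sketch below is an illustration only: it assumes Rodrigues' rotation formula as a concrete expression for $R_{v,\theta}$ (the text does not give one), and I have not verified that its orientation convention matches the “right-handed” convention above; the two identities hold for either convention. The names `rotation`, `phi`, and `close` are hypothetical.

```python
import math

def rotation(v, theta):
    """Rotation by angle theta about the unit vector v, via Rodrigues' formula."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    K = [[0, -z, y], [z, 0, -x], [-y, x, 0]]  # cross-product matrix of v
    return [[c * (i == j) + (1 - c) * v[i] * v[j] + s * K[i][j]
             for j in range(3)] for i in range(3)]

def phi(u):
    """The map Phi from the closed ball of radius pi to SO(3)."""
    r = math.sqrt(sum(a * a for a in u))
    if r == 0:
        return rotation((1.0, 0.0, 0.0), 0.0)  # the identity matrix
    return rotation(tuple(a / r for a in u), r)

def close(A, B, tol=1e-9):
    """True if two 3 x 3 matrices agree entrywise up to tol."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

v = (1 / 3, 2 / 3, 2 / 3)                 # a unit vector
neg_v = tuple(-a for a in v)
boundary = tuple(math.pi * a for a in v)  # a point on the boundary of the ball
interior = tuple(0.5 * a for a in v)      # a point in the interior

print(close(rotation(neg_v, 0.7), rotation(v, -0.7)))         # R_{-v,t} = R_{v,-t}
print(close(phi(boundary), phi(tuple(-a for a in boundary)))) # antipodes agree
print(close(phi(interior), phi(tuple(-a for a in interior)))) # interior points do not
```

The last two checks illustrate that $\Phi$ fails to be injective only on the boundary sphere of radius $\pi$.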


For a different approach to proving Proposition 1.17, see the discussion following 
Proposition 1.19. 


1.4 Homomorphisms 


We now look at the notion of homomorphisms for matrix Lie groups.


Definition 1.18. Let $G$ and $H$ be matrix Lie groups. A map $\Phi$ from $G$ to $H$ is
called a Lie group homomorphism if (1) $\Phi$ is a group homomorphism and (2) $\Phi$
is continuous. If, in addition, $\Phi$ is one-to-one and onto and the inverse map $\Phi^{-1}$ is
continuous, then $\Phi$ is called a Lie group isomorphism.


The condition that $\Phi$ be continuous should be regarded as a technicality, in that it
is very difficult to give an example of a group homomorphism between two matrix
Lie groups which is not continuous. In fact, if $G = \mathbb{R}$ and $H = \mathbb{C}^*$, then any group
homomorphism from $G$ to $H$ which is even measurable (a very weak condition)
must be continuous. (See Exercise 17 in Chapter 9 of [Rud2].)

Note that the inverse of a Lie group isomorphism is continuous (by definition)
and a group homomorphism (by elementary group theory), and thus a Lie group
isomorphism. If $G$ and $H$ are matrix Lie groups and there exists a Lie group
isomorphism from $G$ to $H$, then $G$ and $H$ are said to be isomorphic, and we write
$G \cong H$.

The simplest interesting example of a Lie group homomorphism is the
determinant, which is a homomorphism of $\mathrm{GL}(n;\mathbb{C})$ into $\mathbb{C}^*$. Another simple example is
the map $\Phi : \mathbb{R} \to \mathrm{SO}(2)$ given by

\[ \Phi(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

This map is clearly continuous, and calculation (using standard trigonometric
identities) shows that it is a homomorphism.
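
The homomorphism property $\Phi(a + b) = \Phi(a)\Phi(b)$ amounts to the angle-addition formulas for sine and cosine. A quick numerical check in plain Python (the sample angles are arbitrary, and the helper names are mine):

```python
import math

def phi(theta):
    """The rotation matrix Phi(theta) in SO(2)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dist(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

a, b = 0.4, 1.1
print(dist(phi(a + b), matmul(phi(a), phi(b))))      # tiny: Phi(a + b) = Phi(a) Phi(b)
print(dist(phi(2 * math.pi), [[1, 0], [0, 1]]))      # tiny: 2*pi lies in the kernel
```

The second check shows that $\Phi$ is not injective: every integer multiple of $2\pi$ maps to the identity.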

An important topic for us will be the relationship between the groups SU(2) and
SO(3), which are almost, but not quite, isomorphic. Specifically, we now show that
there exists a Lie group homomorphism $\Phi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ that is two-to-one and
onto. Consider the space $V$ of all $2 \times 2$ complex matrices $X$ which are self-adjoint
(i.e., $X^* = X$) and have trace zero. Elements of $V$ are precisely the matrices of the
form

\[ X = \begin{pmatrix} x_1 & x_2 + ix_3 \\ x_2 - ix_3 & -x_1 \end{pmatrix}, \tag{1.14} \]

with $x_1, x_2, x_3 \in \mathbb{R}$. If we identify $V$ with $\mathbb{R}^3$ by means of the coordinates $x_1$, $x_2$,
and $x_3$ in (1.14), then the standard inner product on $\mathbb{R}^3$ can be computed as

\[ \langle X_1, X_2\rangle = \frac{1}{2}\operatorname{trace}(X_1X_2). \]

That is to say,

\[ \frac{1}{2}\operatorname{trace}\left[\begin{pmatrix} x_1 & x_2 + ix_3 \\ x_2 - ix_3 & -x_1 \end{pmatrix}\begin{pmatrix} x_1' & x_2' + ix_3' \\ x_2' - ix_3' & -x_1' \end{pmatrix}\right] = x_1x_1' + x_2x_2' + x_3x_3', \]

as one may easily check by direct calculation.
For each $U \in \mathrm{SU}(2)$, define a linear map $\Phi_U : V \to V$ by

\[ \Phi_U(X) = UXU^{-1}. \]

Since $U$ is unitary,

\[ (UXU^{-1})^* = (U^{-1})^*XU^* = UXU^{-1}, \]

and since $\operatorname{trace}(UXU^{-1}) = \operatorname{trace}(X) = 0$, this shows that $UXU^{-1}$ is again in $V$.
It is easy to see that $\Phi_{U_1U_2} = \Phi_{U_1}\Phi_{U_2}$. Furthermore,

\[ \frac{1}{2}\operatorname{trace}\big((UX_1U^{-1})(UX_2U^{-1})\big) = \frac{1}{2}\operatorname{trace}(UX_1X_2U^{-1}) = \frac{1}{2}\operatorname{trace}(X_1X_2), \]

since the trace is invariant under conjugation. Thus, each $\Phi_U$ preserves the inner
product $\operatorname{trace}(X_1X_2)/2$ on $V$. It follows that the map $U \mapsto \Phi_U$ is a homomorphism
of SU(2) into the group of orthogonal linear transformations of $V \cong \mathbb{R}^3$, that is, into
O(3). Since SU(2) is connected (Proposition 1.13), $\Phi_U$ must actually lie in SO(3)
for all $U \in \mathrm{SU}(2)$. Thus, $\Phi$ (i.e., the map $U \mapsto \Phi_U$) is a homomorphism of SU(2)
into SO(3), which is easily seen to be continuous. Since $(-I)X(-I)^{-1} = X$, we
see that $\Phi_{-I}$ is the identity element of SO(3).
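
The construction of $\Phi_U$ can be carried out numerically. The sketch below is an illustration under stated assumptions: it uses the basis $E_1, E_2, E_3$ of $V$ read off from the coordinates in (1.14), computes the matrix entries of $\Phi_U$ as $\frac{1}{2}\operatorname{trace}(E_j\, U E_k U^{-1})$, and uses a hypothetical sample $U$; the helper names are mine.

```python
import cmath

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(U):
    """Conjugate transpose; equals the inverse when U is unitary."""
    return [[complex(U[j][i]).conjugate() for j in range(2)] for i in range(2)]

# Basis of V matching the coordinates x1, x2, x3 in (1.14).
E = [[[1, 0], [0, -1]], [[0, 1], [1, 0]], [[0, 1j], [-1j, 0]]]

def phi(U):
    """3 x 3 real matrix of X -> U X U^{-1} in the basis E."""
    Uinv = adjoint(U)
    M = []
    for j in range(3):
        row = []
        for k in range(3):
            P = matmul(E[j], matmul(matmul(U, E[k]), Uinv))
            row.append(((P[0][0] + P[1][1]) / 2).real)  # (1/2) trace(E_j U E_k U^{-1})
        M.append(row)
    return M

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

alpha = 0.6 * cmath.exp(0.3j)   # hypothetical sample with |alpha|^2 + |beta|^2 = 1
beta = 0.8 * cmath.exp(-1.1j)
U = [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]
minus_U = [[-z for z in row] for row in U]

R = phi(U)
Rt = [[R[j][i] for j in range(3)] for i in range(3)]
gram = matmul(R, Rt)            # should be the 3 x 3 identity
print(det3(R))                  # close to 1
print(phi(minus_U) == R)        # U and -U give the same rotation
```

The two printed checks reflect that $\Phi_U$ lies in SO(3) and that $\Phi_{-U} = \Phi_U$.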

Suppose, for example, that $U$ is the matrix

\[ U = \begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix}. \]

Then by direct calculation, we obtain

\[ U\begin{pmatrix} x_1 & x_2 + ix_3 \\ x_2 - ix_3 & -x_1 \end{pmatrix}U^{-1} = \begin{pmatrix} x_1' & x_2' + ix_3' \\ x_2' - ix_3' & -x_1' \end{pmatrix}, \tag{1.15} \]

where $x_1' = x_1$ and

\[ x_2' + ix_3' = e^{i\theta}(x_2 + ix_3) = (x_2\cos\theta - x_3\sin\theta) + i(x_2\sin\theta + x_3\cos\theta). \tag{1.16} \]

In this case, then, $\Phi_U$ is a rotation by angle $\theta$ in the $(x_2,x_3)$-plane. Note that even
though the diagonal entries of $U$ are $e^{\pm i\theta/2}$, the map $\Phi_U$ is a rotation by angle $\theta$,
not $\theta/2$.

Proposition 1.19. The map $U \mapsto \Phi_U$ is a 2-1 and onto map of SU(2) to SO(3),
with kernel equal to $\{I, -I\}$.


Since SU(2) is homeomorphic to $S^3$, the proposition gives another way of seeing
that SO(3) is homeomorphic to $\mathbb{RP}^3$, that is, $S^3$ with antipodal points identified.
This result was obtained in a different way in Proposition 1.17.

It is not hard to show that ® is a “covering map” in the topological sense 
(Section 1.3 of [Hat]). Since SU(2) is simply connected (Proposition 1.15) and 
the map is 2-1, it follows by the theory of covering maps (e.g., Theorem 1.38 in 
[Hat]) that SO(3) cannot be simply connected and, indeed, it must have fundamental 
group Z/2. See Chapter 13 for general information about fundamental groups and 
for a computation of the fundamental group of SO(n), n > 2. 


Proof. Exercise 16 shows that the kernel of $\Phi$ is precisely the set $\{I,-I\}$. To see
that $\Phi$ maps onto SO(3), let $R$ be a rotation of $V \cong \mathbb{R}^3$. By Exercise 14, there exists
an “axis” $X \in V$ such that $R$ is a rotation by some angle $\theta$ in the plane orthogonal
to $X$. If we express $X$ in the form

\[ X = U_0\begin{pmatrix} x_1 & 0 \\ 0 & -x_1 \end{pmatrix}U_0^{-1}, \]

with $U_0 \in \mathrm{U}(2)$, then the plane orthogonal to $X$ in $V$ is the space of matrices of the
form

\[ X' = U_0\begin{pmatrix} 0 & x_2 + ix_3 \\ x_2 - ix_3 & 0 \end{pmatrix}U_0^{-1}. \tag{1.17} \]

If we now take

\[ U = U_0\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix}U_0^{-1}, \]

we can easily see that $UXU^{-1} = X$. On the other hand, the calculations in (1.15)
and (1.16) show that $UX'U^{-1}$ is of the same form as in (1.17), but with $(x_2,x_3)$
rotated by angle $\theta$. Thus, $\Phi_U$ is a rotation by angle $\theta$ in the plane perpendicular to
$X$, showing that $\Phi_U$ coincides with $R$. □


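It is possible, if not terribly useful, to calculate $\Phi$ explicitly. If you write an
element of SU(2) as in Exercise 5, that is, as

\[ U = \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix}, \quad |\alpha|^2 + |\beta|^2 = 1, \]

you (or your computer) may calculate

\[ U\begin{pmatrix} x_1 & x_2 + ix_3 \\ x_2 - ix_3 & -x_1 \end{pmatrix}U^{-1} = \begin{pmatrix} x_1' & x_2' + ix_3' \\ x_2' - ix_3' & -x_1' \end{pmatrix} \]

explicitly. Then $(x_1', x_2', x_3')$ will depend linearly on $(x_1, x_2, x_3)$ and you can express
$(x_1', x_2', x_3')$ as a matrix applied to $(x_1, x_2, x_3)$, with the result that

\[ \Phi_U = \begin{pmatrix} |\alpha|^2 - |\beta|^2 & -2\operatorname{Re}(\alpha\beta) & 2\operatorname{Im}(\alpha\beta) \\ 2\operatorname{Re}(\alpha\bar\beta) & \operatorname{Re}(\alpha^2 - \beta^2) & \operatorname{Im}(\beta^2 - \alpha^2) \\ 2\operatorname{Im}(\alpha\bar\beta) & \operatorname{Im}(\alpha^2 + \beta^2) & \operatorname{Re}(\alpha^2 + \beta^2) \end{pmatrix}. \]

If we take $\alpha = e^{i\theta/2}$ and $\beta = 0$, we may see directly that $\Phi_U$ is a rotation by angle
$\theta$ in the $(x_2,x_3)$-plane, as we saw already in (1.15) and (1.16).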
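
A closed form of this kind is easy to get wrong by a sign or a conjugate, so a numerical comparison is worthwhile. The sketch below assumes the parametrization $U = \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix}$ and the coordinates of (1.14); the closed form coded in `phi_formula` is the one obtained by expanding $UXU^{-1}$ by hand under those assumptions, and `phi_conjugation` recomputes $\Phi_U$ directly. All function names and the sample values are hypothetical.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Basis of V matching the coordinates x1, x2, x3 in (1.14).
E = [[[1, 0], [0, -1]], [[0, 1], [1, 0]], [[0, 1j], [-1j, 0]]]

def phi_conjugation(alpha, beta):
    """Matrix of X -> U X U^{-1} in the basis E, computed entry by entry."""
    U = [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]
    Uinv = [[alpha.conjugate(), beta.conjugate()], [-beta, alpha]]  # U^* = U^{-1}
    M = []
    for j in range(3):
        row = []
        for k in range(3):
            P = matmul(E[j], matmul(matmul(U, E[k]), Uinv))
            row.append(((P[0][0] + P[1][1]) / 2).real)
        M.append(row)
    return M

def phi_formula(alpha, beta):
    """Closed form for the same matrix (derived under the assumptions above)."""
    ab, abbar = alpha * beta, alpha * beta.conjugate()
    a2, b2 = alpha * alpha, beta * beta
    return [[abs(alpha) ** 2 - abs(beta) ** 2, -2 * ab.real, 2 * ab.imag],
            [2 * abbar.real, (a2 - b2).real, (b2 - a2).imag],
            [2 * abbar.imag, (a2 + b2).imag, (a2 + b2).real]]

def dist(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

alpha = 0.6 * cmath.exp(0.3j)   # sample values with |alpha|^2 + |beta|^2 = 1
beta = 0.8 * cmath.exp(-1.1j)
print(dist(phi_conjugation(alpha, beta), phi_formula(alpha, beta)))  # tiny

theta = 0.8                     # the diagonal case alpha = e^{i theta/2}, beta = 0
rot = [[1, 0, 0],
       [0, math.cos(theta), -math.sin(theta)],
       [0, math.sin(theta), math.cos(theta)]]
print(dist(phi_formula(cmath.exp(0.5j * theta), 0j), rot))           # tiny
```

The second check recovers the rotation by angle $\theta$ in the $(x_2,x_3)$-plane from (1.15) and (1.16).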


1.5 Lie Groups 


A Lie group is a smooth manifold equipped with a group structure such that the 
operations of group multiplication and inversion are smooth. As the terminology 
suggests, every matrix Lie group is a Lie group. (See Corollary 3.45 in Chapter 3.) 
The reverse is not true: Not every Lie group is isomorphic to a matrix Lie group. 
Nevertheless, we have restricted our attention in this book to matrix Lie groups, 
in order to minimize prerequisites and keep the discussion as concrete as possible. 
Most of the interesting examples of Lie groups are, in any case, matrix Lie groups. 
A manifold is an object $M$ that looks locally like a piece of $\mathbb{R}^n$. More precisely,
an $n$-dimensional manifold is a second-countable, Hausdorff topological space with
the property that each $m \in M$ has a neighborhood that is homeomorphic to an open
subset of $\mathbb{R}^n$. A two-dimensional torus, for example, looks locally but not globally
like $\mathbb{R}^2$ and is, thus, a two-dimensional manifold. A smooth manifold is a manifold
$M$ together with a collection of local coordinates covering $M$ such that the
change-of-coordinates map between two overlapping coordinate systems is smooth.


Definition 1.20. A Lie group is a smooth manifold $G$ which is also a group and
such that the group product

\[ G \times G \to G \]

and the inverse map $G \to G$ are smooth.


Example 1.21. Let

\[ G = \mathbb{R} \times \mathbb{R} \times S^1 = \{(x,y,u) \mid x \in \mathbb{R},\ y \in \mathbb{R},\ u \in S^1 \subset \mathbb{C}\}, \]

equipped with the group product given by

\[ (x_1,y_1,u_1)\cdot(x_2,y_2,u_2) = (x_1 + x_2,\ y_1 + y_2,\ e^{ix_1y_2}u_1u_2). \]

Then $G$ is a Lie group.

Proof. It is easily checked that this operation is associative; the product of three
elements with either grouping is

\[ (x_1 + x_2 + x_3,\ y_1 + y_2 + y_3,\ e^{i(x_1y_2 + x_1y_3 + x_2y_3)}u_1u_2u_3). \]

There is an identity element in $G$, namely $e = (0,0,1)$, and each element $(x,y,u)$
has an inverse given by $(-x,-y,e^{ixy}u^{-1})$. Thus, $G$ is, in fact, a group. Furthermore,
both the group product and the map that sends each element to its inverse are clearly
smooth, showing that $G$ is a Lie group. □
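
The group axioms for this example can be checked numerically. The following plain-Python sketch encodes the product $(x_1,y_1,u_1)\cdot(x_2,y_2,u_2) = (x_1+x_2,\ y_1+y_2,\ e^{ix_1y_2}u_1u_2)$ and the inverse $(-x,-y,e^{ixy}u^{-1})$; the sample elements and the names `mul`, `inv`, `close` are hypothetical.

```python
import cmath

def mul(g, h):
    """Group product on G = R x R x S^1 from Example 1.21."""
    (x1, y1, u1), (x2, y2, u2) = g, h
    return (x1 + x2, y1 + y2, cmath.exp(1j * x1 * y2) * u1 * u2)

def inv(g):
    """Inverse of (x, y, u), namely (-x, -y, e^{ixy} / u)."""
    x, y, u = g
    return (-x, -y, cmath.exp(1j * x * y) / u)

def close(g, h, tol=1e-12):
    """True if two group elements agree up to tol in every coordinate."""
    return all(abs(a - b) < tol for a, b in zip(g, h))

e = (0.0, 0.0, 1.0 + 0j)
g = (0.3, -1.2, cmath.exp(0.5j))
h = (2.0, 0.7, cmath.exp(-1.1j))
k = (-0.4, 1.5, cmath.exp(2.3j))

print(close(mul(mul(g, h), k), mul(g, mul(h, k))))  # associativity
print(close(mul(g, inv(g)), e), close(mul(inv(g), g), e))  # two-sided inverse
print(close(mul(g, h), mul(h, g)))                  # False: G is not commutative
```

The last check shows that the twisting factor $e^{ix_1y_2}$ makes $G$ noncommutative.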


Although there is nothing about matrices in the definition of G, we may still ask 
whether G is isomorphic to some matrix Lie group. This turns out to be false. As 
shown in Sect. 4.8, there is no continuous, injective homomorphism of G into any 
GL(n; C). We conclude, then, that not every Lie group is isomorphic to a matrix Lie 
group. Nevertheless, most of the interesting examples of Lie groups are matrix Lie 
groups. 

Let us now think briefly about how we might show that every matrix Lie group is
a Lie group. We will prove in Sect. 3.7 that every matrix Lie group is an “embedded
submanifold” of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$. The operations of matrix multiplication and
inversion are smooth on $M_n(\mathbb{C})$ (after restricting to the open subset of invertible
matrices in the case of inversion). Thus, the restriction of these operations to a matrix
Lie group $G \subset M_n(\mathbb{C})$ is also smooth, making $G$ into a Lie group.

It is customary to call a map ® between two Lie groups a Lie group homo- 
morphism if ® is a group homomorphism and ® is smooth, whereas we have 
(in Definition 1.18) required only that ® be continuous. We will show, however, 
that every continuous homomorphism between matrix Lie groups is automatically 
smooth, so that there is no conflict of terminology. See Corollary 3.50 to Theo- 
rem 3.42. Finally, we note that since every matrix Lie group G is a manifold, G 
must be locally path connected. It then follows by a standard topological argument 
that G is connected if and only if it is path connected. 


1.6 Exercises 


1. Let $[\cdot,\cdot]_{n,k}$ be the symmetric bilinear form on $\mathbb{R}^{n+k}$ defined in (1.5). Let $g$ be
   the $(n+k) \times (n+k)$ diagonal matrix with first $n$ diagonal entries equal to one
   and last $k$ diagonal entries equal to minus one:

   \[ g = \begin{pmatrix} I_n & 0 \\ 0 & -I_k \end{pmatrix}. \]

   Show that for all $x, y \in \mathbb{R}^{n+k}$,

   \[ [x,y]_{n,k} = \langle x, gy\rangle. \]

   Show that an $(n+k) \times (n+k)$ real matrix $A$ belongs to O(n; k) if and only if
   $gA^{tr}g = A^{-1}$.


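2. Let $\omega$ be the skew-symmetric bilinear form on $\mathbb{R}^{2n}$ given by (1.7). Let $\Omega$ be the
   $2n \times 2n$ matrix

   \[ \Omega = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \]

   Show that for all $x, y \in \mathbb{R}^{2n}$, we have

   \[ \omega(x,y) = \langle x, \Omega y\rangle. \]

   Show that a $2n \times 2n$ matrix $A$ belongs to $\mathrm{Sp}(n;\mathbb{R})$ if and only if $-\Omega A^{tr}\Omega =
   A^{-1}$.
   Note: A similar analysis applies to $\mathrm{Sp}(n;\mathbb{C})$.

3. Show that the symplectic group $\mathrm{Sp}(1;\mathbb{R}) \subset \mathrm{GL}(2;\mathbb{R})$ is equal to $\mathrm{SL}(2;\mathbb{R})$.
   Show that $\mathrm{Sp}(1;\mathbb{C}) = \mathrm{SL}(2;\mathbb{C})$ and that Sp(1) = SU(2).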


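4. Show that a matrix $R$ belongs to SO(2) if and only if it can be expressed in the
   form

   \[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]

   for some $\theta \in \mathbb{R}$. Show that a matrix $R$ belongs to O(2) if and only if it is of
   one of the two forms:

   \[ A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{or}\quad B = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}. \]

   Hint: Recall that for $A$ to be in O(2), the columns of $A$ must be orthonormal.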


5. Show that if $\alpha$ and $\beta$ are arbitrary complex numbers satisfying $|\alpha|^2 + |\beta|^2 = 1$,
   then the matrix

   \[ A = \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix} \]

   is in SU(2). Show that every $A \in \mathrm{SU}(2)$ can be expressed in this form for a
   unique pair $(\alpha,\beta)$ satisfying $|\alpha|^2 + |\beta|^2 = 1$.

6. Suppose $U$ belongs to $M_{2n}(\mathbb{C})$ and $U$ has an orthonormal basis of eigenvectors
   satisfying the conditions in Theorem 1.6. Show that $U$ belongs to Sp(n).
   Hint: Start by showing that $U$ is unitary. Then show that $\omega(Uz,Uw) = \omega(z,w)$
   if $z$ and $w$ belong to the basis $u_1,\ldots,u_n,v_1,\ldots,v_n$.

7. Using Theorem 1.6, show that Sp(n) is connected and that every element of
   Sp(n) has determinant 1.


8. Determine the center $Z(H)$ of the Heisenberg group $H$. Show that the quotient
   group $H/Z(H)$ is commutative.

9. Suppose $a$ is an irrational real number. Show that the set $E_a$ of numbers of the
   form $e^{2\pi ina}$, $n \in \mathbb{Z}$, is dense in the unit circle $S^1$.
   Hint: Show that if we divide $S^1$ into $N$ equally sized “bins” of length $2\pi/N$,
   there is at least one bin that contains infinitely many elements of $E_a$. Then use
   the fact that $E_a$ is a subgroup of $S^1$.

10. Let $a$ be an irrational real number and let $G$ be the following subgroup of
    $\mathrm{GL}(2;\mathbb{C})$:

    \[ G = \left\{ \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \,\middle|\, t \in \mathbb{R} \right\}. \]

    Show that

    \[ \overline{G} = \left\{ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\phi} \end{pmatrix} \,\middle|\, \theta, \phi \in \mathbb{R} \right\}, \]

    where $\overline{G}$ denotes the closure of the set $G$ inside the space of $2 \times 2$ matrices.
    Hint: Use Exercise 9.
11. A subset $E$ of a matrix Lie group $G$ is called discrete if for each $A$ in $E$ there
    is a neighborhood $U$ of $A$ in $G$ such that $U$ contains no point in $E$ except for
    $A$. Suppose that $G$ is a connected matrix Lie group and $N$ is a discrete normal
    subgroup of $G$. Show that $N$ is contained in the center of $G$.

12. This problem gives an alternative proof of Proposition 1.11, namely that
    $\mathrm{GL}(n;\mathbb{C})$ is connected. Suppose $A$ and $B$ are invertible $n \times n$ matrices.
    Show that there are only finitely many complex numbers $\lambda$ for which
    $\det(\lambda A + (1-\lambda)B) = 0$. Show that there exists a continuous path $A(t)$
    of the form $A(t) = \lambda(t)A + (1-\lambda(t))B$ connecting $A$ to $B$ and such that $A(t)$
    lies in $\mathrm{GL}(n;\mathbb{C})$. Here, $\lambda(t)$ is a continuous path in the plane with $\lambda(0) = 0$
    and $\lambda(1) = 1$.

13. Show that SO(n) is connected, using the following outline.
    For the case $n = 1$, there is nothing to show, since a $1 \times 1$ matrix with
    determinant one must be $[1]$. Assume, then, that $n \geq 2$. Let $e_1$ denote the unit
    vector with entries $1,0,\ldots,0$ in $\mathbb{R}^n$. For every unit vector $v \in \mathbb{R}^n$, show that
    there exists a continuous path $R(t)$ in SO(n) with $R(0) = I$ and $R(1)v = e_1$.
    (Thus, any unit vector can be “continuously rotated” to $e_1$.)
    Now, show that any element $R$ of SO(n) can be connected to a block-diagonal
    matrix of the form

    \[ \begin{pmatrix} 1 & \\ & R_1 \end{pmatrix} \]

    with $R_1 \in \mathrm{SO}(n-1)$ and proceed by induction.

14. If $R$ is an element of SO(3), show that $R$ must have an eigenvector $v$ with
    eigenvalue 1. Show that $R$ maps the plane orthogonal to $v$ into itself. Conclude
    that $R$ is a rotation by some angle $\theta$ around the “axis” $v$.
    Hint: Since $\mathrm{SO}(3) \subset \mathrm{SU}(3)$, every (real or complex) eigenvalue of $R$ must
    have absolute value 1. Since, also, $R$ is real, any nonreal eigenvalues of $R$ come
    in conjugate pairs.

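15. Let $R$ be an element of SO(n).

    (a) Suppose $v \in \mathbb{C}^n$ is an eigenvector for $R$ with eigenvalue $\lambda \in \mathbb{C}$, and suppose
        $\lambda$ is not real. Let $V \subset \mathbb{R}^n$ be the two-dimensional span of $(v + \bar{v})/2$ and
        $(v - \bar{v})/(2i)$. Show that $V$ is invariant under $R$ and that the restriction of $R$
        to $V$ has determinant 1.

    (b) Suppose that a subspace $V \subset \mathbb{R}^n$ is invariant under both $R$ and $R^{-1}$. Show
        that the orthogonal complement $V^\perp$ of $V$ is also invariant under both $R$ and
        $R^{-1}$.

    (c) Show that if $n = 2k$, there exists $S \in \mathrm{SO}(n)$ such that

        \[ R = S\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & & & \\ \sin\theta_1 & \cos\theta_1 & & & \\ & & \ddots & & \\ & & & \cos\theta_k & -\sin\theta_k \\ & & & \sin\theta_k & \cos\theta_k \end{pmatrix}S^{-1} \]

        and that if $n = 2k + 1$, there exists $S \in \mathrm{SO}(n)$ such that

        \[ R = S\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & & & & \\ \sin\theta_1 & \cos\theta_1 & & & & \\ & & \ddots & & & \\ & & & \cos\theta_k & -\sin\theta_k & \\ & & & \sin\theta_k & \cos\theta_k & \\ & & & & & 1 \end{pmatrix}S^{-1}. \]

        That is, in a suitable orthonormal basis, $R$ is block diagonal with $2 \times 2$ blocks
        of the indicated form, with a single $1 \times 1$ block if $n$ is odd.

    Hint: Show that the number of eigenvalues of $R$ equal to $-1$ is even.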
16. (a) Show that if a matrix $A$ commutes with every matrix $X$ of the form (1.14),
        then $A$ commutes with every element of $M_2(\mathbb{C})$. Conclude that $A$ must be
        a multiple of the identity.
    (b) Show that the kernel of the map $U \mapsto \Phi_U$ in Proposition 1.19 is precisely
        the set $\{I,-I\}$.

17. Suppose $G \subset \mathrm{GL}(n_1;\mathbb{C})$ and $H \subset \mathrm{GL}(n_2;\mathbb{C})$ are matrix Lie groups and that
    $\Phi : G \to H$ is a Lie group homomorphism. Then the image of $G$ under $\Phi$ is a
    subgroup of $H$ and thus of $\mathrm{GL}(n_2;\mathbb{C})$. Is the image of $G$ under $\Phi$ necessarily a
    matrix Lie group? Prove or give a counter-example.

18. Show that every continuous homomorphism $\Phi$ from $\mathbb{R}$ to $S^1$ is of the form
    $\Phi(x) = e^{iax}$ for some $a \in \mathbb{R}$.
    Hint: Since $\Phi$ is continuous, there is some $\varepsilon > 0$ such that if $|x| < \varepsilon$, then $\Phi(x)$
    belongs to the right half of the unit circle.


Chapter 2 
The Matrix Exponential 


2.1 The Exponential of a Matrix 


The exponential of a matrix plays a crucial role in the theory of Lie groups. The 
exponential enters into the definition of the Lie algebra of a matrix Lie group 
(Sect. 3.3) and is the mechanism for passing information from the Lie algebra to 
the Lie group. 

If $X$ is an $n \times n$ matrix, we define the exponential of $X$, denoted $e^X$ or $\exp X$,
by the usual power series

\[ e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!}, \tag{2.1} \]

where $X^0$ is defined to be the identity matrix $I$ and where $X^m$ is the repeated matrix
product of $X$ with itself.


Proposition 2.1. The series (2.1) converges for all $X \in M_n(\mathbb{C})$ and $e^X$ is a
continuous function of $X$.


Our proof will use the notion of the norm of a matrix $X \in M_n(\mathbb{C})$, which we
define by thinking of $M_n(\mathbb{C})$ as $\mathbb{C}^{n^2}$.


Definition 2.2. For any $X \in M_n(\mathbb{C})$, we define

\[ \|X\| = \left(\sum_{j,k=1}^{n} |X_{jk}|^2\right)^{1/2}. \]

The quantity $\|X\|$ is called the Hilbert-Schmidt norm of $X$.

The Hilbert-Schmidt norm may be computed in a basis-independent way as

\[ \|X\| = (\operatorname{trace}(X^*X))^{1/2}. \tag{2.2} \]

(See Sect. A.6.) This norm satisfies the inequalities

\[ \|X + Y\| \leq \|X\| + \|Y\|, \tag{2.3} \]
\[ \|XY\| \leq \|X\|\,\|Y\| \tag{2.4} \]

for all $X, Y \in M_n(\mathbb{C})$. The first of these inequalities is the triangle inequality for
$\mathbb{C}^{n^2}$ and the second follows from the Cauchy-Schwarz inequality (Exercise 1). If
$X_m$ is a sequence of matrices, then it is easy to see that $X_m$ converges to a matrix $X$
in the sense of Definition 1.3 if and only if $\|X_m - X\| \to 0$ as $m \to \infty$.


Proof of Proposition 2.1. In light of (2.4), we see that

\[ \|X^m\| \leq \|X\|^m \]

for all $m \geq 1$, and, hence,

\[ \sum_{m=0}^{\infty} \left\|\frac{X^m}{m!}\right\| \leq \|I\| + \sum_{m=1}^{\infty} \frac{\|X\|^m}{m!} < \infty. \]

Thus, the series (2.1) converges absolutely.

To show continuity, note that since $X^m$ is a continuous function of $X$, the partial
sums of (2.1) are continuous. By the Weierstrass M-test, the series (2.1) converges
uniformly on each set of the form $\{\|X\| \leq R\}$. Thus, $e^X$ is continuous on each such
set, and, thus, continuous on all of $M_n(\mathbb{C})$. □
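
Because the series converges absolutely, its partial sums give a simple (if not especially efficient) way to approximate $e^X$. The following plain-Python sketch sums the first few terms of (2.1); the function names `matmul` and `mat_exp`, and the choice of 30 terms, are illustrative assumptions rather than a recommended numerical method.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=30):
    """Partial sum of the series (2.1): sum of X^m / m! for m = 0, ..., terms - 1."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # m = 0 term
    term = [row[:] for row in result]
    for m in range(1, terms):
        # Build X^m / m! from X^(m-1) / (m-1)! by one multiplication and one division.
        term = [[entry / m for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

a = 1.0
E = mat_exp([[0.0, -a], [a, 0.0]])
print(E)  # approximately [[cos a, -sin a], [sin a, cos a]]
```

For this particular $X$, the exponential is the rotation matrix by angle $a$, which gives an independent check on the partial sums.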


We now list some elementary properties of the matrix exponential. 


Proposition 2.3. Let $X$ and $Y$ be arbitrary $n \times n$ matrices. Then we have the
following:

1. $e^0 = I$.
2. $(e^X)^* = e^{X^*}$.
3. $e^X$ is invertible and $(e^X)^{-1} = e^{-X}$.
4. $e^{(\alpha+\beta)X} = e^{\alpha X}e^{\beta X}$ for all $\alpha$ and $\beta$ in $\mathbb{C}$.
5. If $XY = YX$, then $e^{X+Y} = e^Xe^Y = e^Ye^X$.
6. If $C$ is invertible, then $e^{CXC^{-1}} = Ce^XC^{-1}$.


Although $e^{X+Y} = e^Xe^Y$ when $X$ and $Y$ commute, this identity fails in general.
This is an important point, which we will return to in the Lie product formula in
Sect. 2.4 and the Baker-Campbell-Hausdorff formula in Chapter 5.
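
A concrete pair of noncommuting matrices makes the failure visible. The sketch below, in plain Python, approximates the exponential by partial sums of the power series; the matrices $X$ and $Y$ are hypothetical examples chosen because their exponentials are easy to compute by hand, and the helper names are mine.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=30):
    """Partial sum of the exponential series: sum of X^m / m! for m < terms."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def dist(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]   # XY != YX
S = [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

lhs = mat_exp(S)                        # e^{X+Y} = [[cosh 1, sinh 1], [sinh 1, cosh 1]]
rhs = matmul(mat_exp(X), mat_exp(Y))    # e^X e^Y = [[2, 1], [1, 1]]
gap = dist(lhs, rhs)
print(gap)                              # clearly nonzero

X3 = [[3 * x for x in row] for row in X]   # X and 3X commute
X4 = [[4 * x for x in row] for row in X]
print(dist(mat_exp(X4), matmul(mat_exp(X), mat_exp(X3))))  # zero up to rounding
```

The first comparison illustrates the failure of $e^{X+Y} = e^Xe^Y$ for noncommuting matrices, and the second illustrates Point 4 with $\alpha = 1$ and $\beta = 3$.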


Proof. Point 1 is obvious and Point 2 follows from taking term-by-term adjoints
of the series for $e^X$. Points 3 and 4 are special cases of Point 5. To verify Point 5,
we simply multiply the two power series term by term, which is permitted because
both series converge absolutely. Multiplying out $e^Xe^Y$ and collecting terms where
the power of $X$ plus the power of $Y$ equals $m$, we obtain

\[ e^Xe^Y = \sum_{m=0}^{\infty}\sum_{k=0}^{m} \frac{X^k}{k!}\frac{Y^{m-k}}{(m-k)!} = \sum_{m=0}^{\infty} \frac{1}{m!}\sum_{k=0}^{m} \frac{m!}{k!(m-k)!}X^kY^{m-k}. \tag{2.5} \]

Now, because (and only because) $X$ and $Y$ commute,

\[ (X + Y)^m = \sum_{k=0}^{m} \frac{m!}{k!(m-k)!}X^kY^{m-k}, \]

and, thus, (2.5) becomes

\[ e^Xe^Y = \sum_{m=0}^{\infty} \frac{1}{m!}(X + Y)^m = e^{X+Y}. \]

To prove Point 6, simply note that

\[ (CXC^{-1})^m = CX^mC^{-1} \]

and, thus, the two sides of Point 6 are equal term by term. □


Proposition 2.4. Let $X$ be an $n \times n$ complex matrix. Then $e^{tX}$ is a smooth curve in
$M_n(\mathbb{C})$ and

\[ \frac{d}{dt}e^{tX} = Xe^{tX} = e^{tX}X. \]

In particular,

\[ \left.\frac{d}{dt}e^{tX}\right|_{t=0} = X. \]

Results that hold for the exponential of numbers may or may not hold for the
matrix exponential. Although Proposition 2.4 is what one would expect from the
scalar case, it should be noted that, in general, the derivative of $e^{X+tY}$ is not equal
to $e^{X+tY}Y$. See Sect. 5.4.


Proof. Differentiate the power series for $e^{tX}$ term by term. This is permitted because,
for each $j$ and $k$, $(e^{tX})_{jk}$ is given by a convergent power series in $t$, and one can
differentiate a power series term by term inside its radius of convergence (e.g.,
Theorem 12 in Chapter 4 of [Pugh]). □


2.2 Computing the Exponential


We consider here methods for exponentiating general matrices. A special method
for exponentiating $2 \times 2$ matrices is described in Exercises 6 and 7. Suppose that
$X \in M_n(\mathbb{C})$ has $n$ linearly independent eigenvectors $v_1, \ldots, v_n$ with eigenvalues
$\lambda_1, \ldots, \lambda_n$. Let $C$ be the $n \times n$ matrix whose columns are $v_1, \ldots, v_n$ and let $D$
be the diagonal matrix with diagonal entries $\lambda_1, \ldots, \lambda_n$. Then $X = CDC^{-1}$. It is
easily verified that $e^D$ is the diagonal matrix with diagonal entries $e^{\lambda_1}, \ldots, e^{\lambda_n}$, and
thus, by Proposition 2.3, we have

\[
e^X = C e^D C^{-1} = C \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} C^{-1}.
\]


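Meanwhile, if $X$ is nilpotent (i.e., $X^k = 0$ for some $k$), then the series that
defines $e^X$ terminates. Finally, according to Theorem A.6, every matrix $X$ can be
written (uniquely) in the form $X = S + N$, with $S$ diagonalizable, $N$ nilpotent, and
$SN = NS$. Then, since $N$ and $S$ commute,

\[
e^X = e^{S+N} = e^S e^N,
\]

where $e^S$ can be computed by diagonalization and $e^N$ from the terminating series.


Example 2.5. Consider the matrices

\[
X_1 = \begin{pmatrix} 0 & -a \\ a & 0 \end{pmatrix}, \qquad
X_2 = \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix}, \qquad
X_3 = \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}.
\]

Then

\[
e^{X_1} = \begin{pmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{pmatrix}
\]

and

\[
e^{X_2} = \begin{pmatrix} 1 & a & b + ac/2 \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}
\]

and

\[
e^{X_3} = \begin{pmatrix} e^a & e^a b \\ 0 & e^a \end{pmatrix}.
\]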


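Proof. The eigenvectors of $X_1$ are $(1, i)$ and $(i, 1)$, with eigenvalues $-ia$ and $ia$,
respectively. Thus,

\[
e^{X_1} = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}
\begin{pmatrix} e^{-ia} & 0 \\ 0 & e^{ia} \end{pmatrix}
\begin{pmatrix} 1/2 & -i/2 \\ -i/2 & 1/2 \end{pmatrix},
\]

which simplifies to the claimed result. Meanwhile, $X_2^2$ has the value $ac$ in the upper
right-hand corner and all other entries equal to zero, whereas $X_2^3 = 0$. Thus, $e^{X_2} =
I + X_2 + X_2^2/2$, which reduces to the claimed result. Finally,

\[
X_3 = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix},
\]

where the two terms clearly commute and the second term is nilpotent. Thus, we
obtain

\[
e^{X_3} = \begin{pmatrix} e^a & 0 \\ 0 & e^a \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix},
\]

which reduces to the claimed result. □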
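The three closed forms in Example 2.5 can be compared against a truncated power series. The sketch below takes $X_1$ as the $2 \times 2$ matrix with rows $(0, -a)$ and $(a, 0)$, $X_2$ as the strictly upper triangular $3 \times 3$ matrix with entries $a, b, c$, and $X_3 = aI + bN$ with $N$ nilpotent; the numerical values of $a$, $b$, $c$ and the helper names are arbitrary choices made for illustration.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

a, b, c = 0.7, -1.3, 2.0   # sample parameter values (assumed)

X1 = [[0.0, -a], [a, 0.0]]
X2 = [[0.0, a, b], [0.0, 0.0, c], [0.0, 0.0, 0.0]]
X3 = [[a, b], [0.0, a]]

claimed1 = [[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]]
claimed2 = [[1.0, a, b + a * c / 2], [0.0, 1.0, c], [0.0, 0.0, 1.0]]
claimed3 = [[math.exp(a), math.exp(a) * b], [0.0, math.exp(a)]]

err1 = max_diff(mat_exp(X1), claimed1)
err2 = max_diff(mat_exp(X2), claimed2)
err3 = max_diff(mat_exp(X3), claimed3)
print(err1, err2, err3)
```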


The matrix exponential is used in the elementary theory of differential equations to
solve systems of linear equations. Consider a first-order differential equation of the
form

\[
\frac{dv}{dt} = Xv, \qquad v(0) = v_0,
\]

where $v(t) \in \mathbb{R}^n$ and $X$ is a fixed $n \times n$ matrix. The (unique) solution of this equation
is given by

\[
v(t) = e^{tX} v_0,
\]

as may be easily verified using Proposition 2.4. Curves of the form $t \mapsto e^{tX} v_0$, with
$v_0$ fixed, trace out the flow along the vector field $v \mapsto Xv$.
Let us consider the two matrices

\[
X_4 = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}, \qquad
X_5 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}. \tag{2.6}
\]


Figure 2.1 plots several curves of this form for each matrix (see Exercise 8).

Fig. 2.1 Curves of the form $t \mapsto e^{tX_4} v_0$ (left) and $t \mapsto e^{tX_5} v_0$ (right)
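The claim that $v(t) = e^{tX} v_0$ solves the differential equation can be tested by comparing it with a direct numerical integration. The following sketch uses a classical Runge-Kutta step as the independent integrator; the matrix, initial condition, step count, and function names are all assumptions chosen for illustration and are not the matrices in (2.6).

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def rk4_flow(X, v0, t_end, steps):
    """Integrate dv/dt = Xv from t = 0 to t_end with classical Runge-Kutta steps."""
    dt = t_end / steps
    v = list(v0)
    for _ in range(steps):
        k1 = mat_vec(X, v)
        k2 = mat_vec(X, [x + 0.5 * dt * k for x, k in zip(v, k1)])
        k3 = mat_vec(X, [x + 0.5 * dt * k for x, k in zip(v, k2)])
        k4 = mat_vec(X, [x + dt * k for x, k in zip(v, k3)])
        v = [x + dt * (p + 2 * q + 2 * r + s) / 6
             for x, p, q, r, s in zip(v, k1, k2, k3, k4)]
    return v

X = [[-0.5, 2.0], [-2.0, -0.5]]   # sample matrix (assumed): a decaying rotation
v0 = [1.0, 0.0]
t_end = 1.0

closed_form = mat_vec(mat_exp([[t_end * x for x in row] for row in X]), v0)
integrated = rk4_flow(X, v0, t_end, steps=1000)
flow_error = max(abs(p - q) for p, q in zip(closed_form, integrated))
print(closed_form, flow_error)
```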


2.3 The Matrix Logarithm 


We now wish to define a matrix logarithm, which should be an inverse function
(to the extent possible) to the matrix exponential. Let us recall the situation for the
logarithm of complex numbers, in order to see what is reasonable to expect in the
matrix case. Since $e^z$ is never zero, only nonzero numbers can have a logarithm.
Every nonzero complex number can be written as $e^z$ for some $z$, but the $z$ is not
unique and cannot be defined continuously on $\mathbb{C}^*$. In the matrix case, $e^X$ is invertible
for all $X \in M_n(\mathbb{C})$. We will see (Theorem 2.10) that every invertible matrix can be
written as $e^X$, for some $X \in M_n(\mathbb{C})$, but the $X$ is not unique.

The simplest way to define the matrix logarithm is by a power series. We recall 
how this works in the complex case. 


Lemma 2.6. The function

\[
\log z = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(z-1)^m}{m} \tag{2.7}
\]

is defined and analytic in a circle of radius 1 about $z = 1$.
For all $z$ with $|z - 1| < 1$,

\[
e^{\log z} = z.
\]

For all $u$ with $|u| < \log 2$, we have $|e^u - 1| < 1$ and

\[
\log e^u = u.
\]


Proof. The usual logarithm for real, positive numbers satisfies

\[
\frac{d}{dx} \log(1 - x) = -\frac{1}{1-x} = -(1 + x + x^2 + \cdots)
\]

for $|x| < 1$. Integrating term by term and noting that $\log 1 = 0$ gives

\[
\log(1 - x) = -\left( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots \right).
\]

Taking $z = 1 - x$ (so that $x = 1 - z$), we have

\[
\log z = -\left( (1 - z) + \frac{(1-z)^2}{2} + \frac{(1-z)^3}{3} + \cdots \right)
= \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(z-1)^m}{m}. \tag{2.8}
\]

The series (2.8) has radius of convergence 1 and defines a holomorphic function
on the set $\{|z - 1| < 1\}$, which coincides with the usual logarithm for real $z$ in the
interval $(0, 2)$. Now, $\exp(\log z) = z$ for $z \in (0, 2)$ and since both sides of this identity
are holomorphic in $z$, the identity continues to hold on the whole set $\{|z - 1| < 1\}$.

On the other hand, if $|u| < \log 2$, then

\[
|e^u - 1| = \left| u + \frac{u^2}{2!} + \cdots \right| \le |u| + \frac{|u|^2}{2!} + \cdots = e^{|u|} - 1 < 1.
\]

Thus, $\log(\exp u)$ makes sense for all such $u$. Since $\log(\exp u) = u$ for real $u$ with
$|u| < \log 2$, it follows by holomorphicity that $\log(\exp u) = u$ for all complex
numbers with $|u| < \log 2$. □


Definition 2.7. For an $n \times n$ matrix $A$, define $\log A$ by

\[
\log A = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m} \tag{2.9}
\]

whenever the series converges.


Since the complex-valued series (2.7) has radius of convergence 1 and since
$\|(A - I)^m\| \le \|A - I\|^m$ for $m \ge 1$, the matrix-valued series (2.9) will converge
if $\|A - I\| < 1$. Even if $\|A - I\| > 1$, the series might converge, for example, if
$A - I$ is nilpotent (see Exercise 9).


Theorem 2.8. The function

\[
\log A = \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m}
\]

is defined and continuous on the set of all $n \times n$ complex matrices $A$
with $\|A - I\| < 1$.
For all $A$ with $\|A - I\| < 1$,

\[
e^{\log A} = A.
\]

For all $X$ with $\|X\| < \log 2$, we have $\|e^X - I\| < 1$ and

\[
\log e^X = X.
\]


Although it might seem plausible that $\log(e^X)$ should be equal to $X$ whenever
the series for the logarithm is convergent, this claim is false (even over $\mathbb{C}$). If, for
example, $X = 2\pi i I$, then $e^X = e^{2\pi i} I = I$. Then $e^X - I = 0$, so that $\log(e^X)$
is defined and equal to 0. In this case, $\log(e^X)$ is defined but not equal to $X$. Thus,
the assumption that $\|X\| < \log 2$ cannot be replaced by, say, the assumption that
$\|e^X - I\| < 1$.


Proof. Since $\|(A - I)^m\| \le \|A - I\|^m$ and since the series (2.7) has radius of
convergence 1, the series (2.9) converges absolutely for all $A$ with $\|A - I\| < 1$.
The proof of continuity is essentially the same as for the exponential.

Suppose now that $A$ satisfies $\|A - I\| < 1$. If $A$ is diagonalizable with eigenvalues
$z_1, \ldots, z_n$, then we can express $A$ in the form $CDC^{-1}$ with $D$ diagonal, in which
case

\[
(A - I)^m = C \begin{pmatrix} (z_1 - 1)^m & & 0 \\ & \ddots & \\ 0 & & (z_n - 1)^m \end{pmatrix} C^{-1}.
\]

Since $\|A - I\| < 1$, each eigenvalue $z_j$ of $A$ must satisfy $|z_j - 1| < 1$ (Exercise 2).
Thus,

\[
\sum_{m=1}^{\infty} (-1)^{m+1} \frac{(A - I)^m}{m}
= C \begin{pmatrix} \log z_1 & & 0 \\ & \ddots & \\ 0 & & \log z_n \end{pmatrix} C^{-1},
\]

and by Lemma 2.6,

\[
e^{\log A} = C \begin{pmatrix} e^{\log z_1} & & 0 \\ & \ddots & \\ 0 & & e^{\log z_n} \end{pmatrix} C^{-1} = A.
\]

If $A$ is not diagonalizable, we approximate $A$ by a sequence $A_m$ of diagonalizable
matrices (Exercise 4) and appeal to the continuity of the logarithm and exponential
functions. Thus, $\exp(\log A) = A$ for all $A$ with $\|A - I\| < 1$.

Now, the same argument as in the complex case shows that if $\|X\| < \log 2$, then
$\|e^X - I\| < 1$. The proof that $\log(e^X) = X$ is then very similar to the proof that
$\exp(\log A) = A$. □
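Both identities in Theorem 2.8, and the failure of $\log(e^X) = X$ outside the stated range, can be observed with truncated series. This is a sketch only: the truncation lengths, the sample matrices, and the helper names are assumptions made for illustration.

```python
import cmath

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def mat_log(A, terms=200):
    """Truncation of series (2.9): sum of (-1)^(m+1) (A - I)^m / m."""
    n = len(A)
    B = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    power = [row[:] for row in B]            # holds (A - I)^m
    total = [[0.0] * n for _ in range(n)]
    for m in range(1, terms + 1):
        sign = 1.0 if m % 2 == 1 else -1.0
        total = [[t + sign * p / m for t, p in zip(rt, rp)] for rt, rp in zip(total, power)]
        power = mat_mul(power, B)
    return total

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

A = [[1.2, 0.3], [-0.1, 0.9]]        # ||A - I|| is about 0.39 < 1
err_exp_log = max_diff(mat_exp(mat_log(A)), A)

X = [[0.1, -0.2], [0.3, 0.05]]       # ||X|| is about 0.38 < log 2
err_log_exp = max_diff(mat_log(mat_exp(X)), X)

# The 1 x 1 case X = 2*pi*i: e^X = 1, so the log series returns (nearly) 0, not X.
Z = [[2j * cmath.pi]]
log_exp_Z = mat_log(mat_exp(Z))[0][0]
print(err_exp_log, err_log_exp, abs(log_exp_Z))
```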


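Proposition 2.9. There exists a constant $c$ such that for all $n \times n$ matrices $B$ with
$\|B\| < \frac{1}{2}$,

\[
\|\log(I + B) - B\| \le c \|B\|^2.
\]


Proof. Note that

\[
\log(I + B) - B = \sum_{m=2}^{\infty} (-1)^{m+1} \frac{B^m}{m}
= B^2 \sum_{m=2}^{\infty} (-1)^{m+1} \frac{B^{m-2}}{m},
\]

so that

\[
\|\log(I + B) - B\| \le \|B\|^2 \sum_{m=2}^{\infty} \frac{(1/2)^{m-2}}{m},
\]

which is an estimate of the desired form. □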


We may restate the proposition in a more concise way by saying that

\[
\log(I + B) = B + O(\|B\|^2),
\]

where $O(\|B\|^2)$ denotes a quantity of order $\|B\|^2$ (i.e., a quantity that is bounded
by a constant times $\|B\|^2$ for all sufficiently small values of $\|B\|$).
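Although only the existence of the constant is needed, the bound arising in the proof can be evaluated in closed form. The following short computation is a supplementary remark; it uses the scalar series $-\log(1 - x) = \sum_{m \ge 1} x^m/m$ at $x = 1/2$.

```latex
\[
c = \sum_{m=2}^{\infty} \frac{(1/2)^{m-2}}{m}
  = 4 \sum_{m=2}^{\infty} \frac{(1/2)^{m}}{m}
  = 4 \left( \log 2 - \frac{1}{2} \right)
  \approx 0.773 .
\]
```

Thus, for $\|B\| < \frac{1}{2}$, the proof gives $\|\log(I + B) - B\| \le 0.773\,\|B\|^2$, with no claim that this constant is optimal.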

We conclude this section with a result that, although we will not use it elsewhere, 
is worth recording. The proof is sketched in Exercises 9 and 10. 


Theorem 2.10. Every invertible $n \times n$ matrix can be expressed as $e^X$ for some
$X \in M_n(\mathbb{C})$.


2.4 Further Properties of the Exponential


In this section, we give several additional results involving the exponential of a 
matrix that will be important in our study of Lie algebras. 


Theorem 2.11 (Lie Product Formula). For all $X, Y \in M_n(\mathbb{C})$, we have

\[
e^{X+Y} = \lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m.
\]


There is a version of this result, known as the Trotter product formula, which 
holds for suitable unbounded operators on an infinite-dimensional Hilbert space. 
See, for example, Theorem 20.1 in [Hall]. 


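Proof. If we multiply the power series for $e^{X/m}$ and $e^{Y/m}$, all but three of the terms will
involve $1/m^2$ or higher powers of $1/m$. Thus,

\[
e^{X/m} e^{Y/m} = I + \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right).
\]

Now, since $e^{X/m} e^{Y/m} \to I$ as $m \to \infty$, $e^{X/m} e^{Y/m}$ is in the domain of the logarithm for all
sufficiently large $m$. By Proposition 2.9,

\[
\begin{aligned}
\log\left( e^{X/m} e^{Y/m} \right)
&= \log\left( I + \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right) \\
&= \frac{X}{m} + \frac{Y}{m} + O\!\left( \left\| \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right\|^2 \right) \\
&= \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right).
\end{aligned}
\]

Exponentiating the logarithm then gives

\[
e^{X/m} e^{Y/m} = \exp\left( \frac{X}{m} + \frac{Y}{m} + O\!\left( \frac{1}{m^2} \right) \right)
\]

and, therefore,

\[
\left( e^{X/m} e^{Y/m} \right)^m = \exp\left( X + Y + O\!\left( \frac{1}{m} \right) \right).
\]

Thus, by the continuity of the exponential, we conclude that

\[
\lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m = \exp(X + Y),
\]

which is the Lie product formula. □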
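The $O(1/m)$ rate appearing in the proof is visible numerically. The sketch below evaluates $(e^{X/m} e^{Y/m})^m$ for a non-commuting pair and increasing $m$; the matrices, the list of values of $m$, and the helper names are illustrative assumptions.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def mat_pow(A, m):
    """A^m by repeated multiplication (adequate for the small m used here)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, A)
    return result

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]          # XY != YX
target = mat_exp([[0.0, 1.0], [1.0, 0.0]])   # e^{X+Y}

errors = []
for m in (1, 10, 100, 1000):
    step = mat_mul(mat_exp([[v / m for v in row] for row in X]),
                   mat_exp([[v / m for v in row] for row in Y]))
    errors.append(max_diff(mat_pow(step, m), target))
print(errors)
```

Each tenfold increase in $m$ reduces the error by roughly a factor of ten, consistent with the $O(1/m)$ term in the exponent.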
Recall (Sect. A.5) that the trace of a matrix is defined as the sum of its diagonal
entries and that similar matrices have the same trace.


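Theorem 2.12. For any $X \in M_n(\mathbb{C})$, we have

\[
\det(e^X) = e^{\operatorname{trace}(X)}.
\]


Proof. If $X$ is diagonalizable with eigenvalues $\lambda_1, \ldots, \lambda_n$, then $e^X$ is diagonalizable
with eigenvalues $e^{\lambda_1}, \ldots, e^{\lambda_n}$. Thus, $\operatorname{trace}(X) = \sum_j \lambda_j$ and

\[
\det(e^X) = e^{\lambda_1} \cdots e^{\lambda_n} = e^{\lambda_1 + \cdots + \lambda_n} = e^{\operatorname{trace}(X)}.
\]

If $X$ is not diagonalizable, we can approximate it by matrices that are diagonalizable
(Exercise 4). □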
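A quick numerical check of Theorem 2.12 in the $2 \times 2$ case is sketched below. The test matrix and the helper names are assumptions made for illustration.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

X = [[1.0, 2.0], [3.0, -0.5]]             # sample matrix with trace 0.5
det_of_exp = det2(mat_exp(X))
exp_of_trace = math.exp(X[0][0] + X[1][1])
print(det_of_exp, exp_of_trace)
```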


Definition 2.13. A function $A : \mathbb{R} \to \mathrm{GL}(n; \mathbb{C})$ is called a one-parameter
subgroup of $\mathrm{GL}(n; \mathbb{C})$ if


1. $A$ is continuous,
2. $A(0) = I$,
3. $A(t + s) = A(t)A(s)$ for all $t, s \in \mathbb{R}$.


Theorem 2.14 (One-Parameter Subgroups). If $A(\cdot)$ is a one-parameter subgroup
of $\mathrm{GL}(n; \mathbb{C})$, there exists a unique $n \times n$ complex matrix $X$ such that

\[
A(t) = e^{tX}.
\]


By taking $n = 1$, and noting that $\mathrm{GL}(1; \mathbb{C}) \cong \mathbb{C}^*$, this theorem provides a
method of solving Exercise 18 in Chapter 1.


Lemma 2.15. Fix some $\varepsilon$ with $\varepsilon < \log 2$. Let $B_{\varepsilon/2}$ be the ball of radius $\varepsilon/2$ around
the origin in $M_n(\mathbb{C})$, and let $U = \exp(B_{\varepsilon/2})$. Then every $B \in U$ has a unique
square root $C$ in $U$, given by $C = \exp(\frac{1}{2} \log B)$.


Proof. It is evident that $C$ is a square root of $B$ and that $C$ is in $U$. To establish
uniqueness, suppose $C' \in U$ satisfies $(C')^2 = B$. Let $Y = \log C'$; then $\exp(Y) =
C'$ and

\[
\exp(2Y) = (C')^2 = B = \exp(\log B).
\]

We have that $Y \in B_{\varepsilon/2}$ and, thus, $2Y \in B_{\varepsilon}$, and also that $\log B \in B_{\varepsilon/2} \subset B_{\varepsilon}$. Since,
by Theorem 2.8, $\exp$ is injective on $B_{\varepsilon}$ and $\exp(2Y) = \exp(\log B)$, we must have
$2Y = \log B$. Thus, $C' = \exp(\frac{1}{2} \log B) = C$. □


Proof of Theorem 2.14. The uniqueness is immediate, since if there is such an $X$,
then $X = \frac{d}{dt} A(t) \big|_{t=0}$. To prove existence, let $U$ be as in Lemma 2.15, which is an
open set in $\mathrm{GL}(n; \mathbb{C})$. The continuity of $A$ guarantees that there exists $t_0 > 0$ such
that $A(t) \in U$ for all $t$ with $|t| \le t_0$. Define

\[
X = \frac{1}{t_0} \log(A(t_0)),
\]

so that $t_0 X = \log(A(t_0))$. Then $t_0 X \in B_{\varepsilon/2}$ and

\[
e^{t_0 X} = A(t_0).
\]

Now, $A(t_0/2)$ is again in $U$ and $A(t_0/2)^2 = A(t_0)$. But by Lemma 2.15, $A(t_0)$ has
a unique square root in $U$, and that unique square root is $\exp(t_0 X/2)$. Thus,

\[
A(t_0/2) = \exp(t_0 X/2).
\]

Applying this argument repeatedly, we conclude that

\[
A(t_0/2^k) = \exp(t_0 X/2^k)
\]

for all positive integers $k$. Then for any integer $m$, we have

\[
A(m t_0/2^k) = A(t_0/2^k)^m = \exp(m t_0 X/2^k).
\]

It follows that $A(t) = \exp(tX)$ for all real numbers $t$ of the form $t = m t_0/2^k$, and
the set of such $t$'s is dense in $\mathbb{R}$. Since both $\exp(tX)$ and $A(t)$ are continuous, it
follows that $A(t) = \exp(tX)$ for all real numbers $t$. □
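The existence argument is constructive: the generator is recovered as $X = \frac{1}{t_0} \log A(t_0)$ for a small $t_0$. The sketch below carries this out for a sample one-parameter subgroup (rotation by the angle $3t$), which is an assumption chosen for illustration, as are the value of `t0` and the helper names.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(X, terms=60):
    """Truncated power series for e^X."""
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, X)]
        total = [[s + v for s, v in zip(rs, rv)] for rs, rv in zip(total, term)]
    return total

def mat_log(A, terms=200):
    """Truncated logarithm series in powers of A - I."""
    n = len(A)
    B = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    power = [row[:] for row in B]
    total = [[0.0] * n for _ in range(n)]
    for m in range(1, terms + 1):
        sign = 1.0 if m % 2 == 1 else -1.0
        total = [[t + sign * p / m for t, p in zip(rt, rp)] for rt, rp in zip(total, power)]
        power = mat_mul(power, B)
    return total

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

def A(t):
    """Sample one-parameter subgroup: rotation by the angle 3t."""
    return [[math.cos(3 * t), -math.sin(3 * t)], [math.sin(3 * t), math.cos(3 * t)]]

t0 = 0.05                                   # small enough that t0*X lies in the ball of Lemma 2.15
X = [[v / t0 for v in row] for row in mat_log(A(t0))]
err_generator = max_diff(X, [[0.0, -3.0], [3.0, 0.0]])

t = 1.7                                     # a time far outside [-t0, t0]
err_curve = max_diff(mat_exp([[t * v for v in row] for row in X]), A(t))
print(err_generator, err_curve)
```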


Proposition 2.16. The exponential map is an infinitely differentiable map of $M_n(\mathbb{C})$
into $M_n(\mathbb{C})$.


We will compute the derivative of the matrix exponential in Chapter 5.


Proof. Note that for each $j$ and $k$, the quantity $(X^m)_{jk}$ is a homogeneous polynomial
of degree $m$ in the entries of $X$. Thus, the series for the function $(e^X)_{jk}$ has the form
of a multivariable power series on $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$. Since the series converges on all
of $\mathbb{R}^{2n^2}$, it is permissible to differentiate the series term by term as many times as
we like. (Apply Theorem 12 in Chapter 4 of [Pugh] in each of the $n^2$ variables with
the other variables fixed.) □


2.5 The Polar Decomposition 


The polar decomposition for a nonzero complex number $z$ states that $z$ can be written
uniquely as $z = up$, where $|u| = 1$ and $p$ is real and positive. (If $z = 0$, the
decomposition still exists, with $p = 0$, but $u$ is not unique.) Since $p$ is real and
positive, it can be written as $p = e^x$ for a unique real number $x$. This gives an
unconventional form of the polar decomposition for $z$, namely

\[
z = u e^x, \tag{2.10}
\]

with $x \in \mathbb{R}$ and $|u| = 1$. Although it is customary to leave $p$ as a positive real
number and to write $u$ as $u = e^{i\theta}$, the decomposition in (2.10) is more convenient
for us because $x$, unlike $\theta$, is unique.

We wish to establish a similar polar decomposition first for $\mathrm{GL}(n; \mathbb{C})$ and then
for various subgroups thereof. If $P$ is a self-adjoint $n \times n$ matrix (i.e., $P^* = P$), we
say that $P$ is positive if $\langle v, Pv \rangle > 0$ for all nonzero $v \in \mathbb{C}^n$. It is easy to check that
a self-adjoint matrix $P$ is positive if and only if all the eigenvalues of $P$ are positive.
Suppose now that $A$ is an invertible $n \times n$ matrix. We wish to write $A$ as $A = UP$
where $U$ is unitary and $P$ is self-adjoint and positive. We will then write the
self-adjoint, positive matrix $P$ as $P = e^X$ where $X$ is self-adjoint but not necessarily
positive.


Theorem 2.17.


1. Every $A \in \mathrm{GL}(n; \mathbb{C})$ can be written uniquely in the form

\[
A = UP
\]

where $U$ is unitary and $P$ is self-adjoint and positive.
2. Every self-adjoint positive matrix $P$ can be written uniquely in the form

\[
P = e^X
\]

with $X$ self-adjoint. Conversely, if $X$ is self-adjoint, then $e^X$ is self-adjoint and
positive.
3. If we decompose each $A \in \mathrm{GL}(n; \mathbb{C})$ (uniquely) as

\[
A = U e^X
\]

with $U$ unitary and $X$ self-adjoint, then $U$ and $X$ depend continuously on $A$.


Lemma 2.18. If $Q$ is a self-adjoint, positive matrix, then $Q$ has a unique positive,
self-adjoint square root.


Proof. Since $Q$ has an orthonormal basis of eigenvectors, $Q$ can be written as

\[
Q = U \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} U^{-1}
\]

with $U$ unitary. Since $Q$ is self-adjoint and positive, each $\lambda_j$ is positive. Thus, we
can construct a square root of $Q$ as

\[
Q^{1/2} = U \begin{pmatrix} \lambda_1^{1/2} & & \\ & \ddots & \\ & & \lambda_n^{1/2} \end{pmatrix} U^{-1}, \tag{2.11}
\]

and $Q^{1/2}$ will still be self-adjoint and positive, establishing the existence of the
square root.

If $P$ is a self-adjoint, positive matrix, the eigenspaces of $P^2$ are precisely the
same as the eigenspaces of $P$, with the eigenvalues of $P^2$ being, of course, the
squares of the eigenvalues of $P$. The point here is that because the function $x \mapsto x^2$
is injective on positive real numbers, eigenspaces with distinct eigenvalues remain
with distinct eigenvalues after squaring. Looking at this claim the other way around,
if a positive, self-adjoint matrix $Q$ is to have a positive self-adjoint square root $P$,
the eigenspaces of $P$ must be the same as the eigenspaces of $Q$, and the eigenvalues
of $P$ must be the positive square roots of the eigenvalues of $Q$. Thus, $P$ is uniquely
determined by $Q$. □


Proof of Theorem 2.17. For the existence of the decomposition in Point 1, note that
if $A = UP$, then $A^*A = PU^*UP = P^2$. Now, for any matrix $A$, the matrix $A^*A$
is self-adjoint. If, in addition, $A$ is invertible, then for all nonzero $v \in \mathbb{C}^n$, we have

\[
\langle v, A^*Av \rangle = \langle Av, Av \rangle > 0,
\]

showing that $A^*A$ is positive. For all invertible $A$, then, let us define $P$ by

\[
P = (A^*A)^{1/2},
\]

where $(\cdot)^{1/2}$ is the unique positive square root of Lemma 2.18. We then define

\[
U = AP^{-1} = A[(A^*A)^{1/2}]^{-1}.
\]

Since $P$ is, by construction, self-adjoint and positive, and since $A = UP$ by the
definition of $U$, it remains only to check that $U$ is unitary. To that end, we check
that

\[
U^*U = [(A^*A)^{1/2}]^{-1} (A^*A) [(A^*A)^{1/2}]^{-1},
\]

since the inverse of a positive self-adjoint matrix is self-adjoint. Since $A^*A$ is the
square of $(A^*A)^{1/2}$, we see that $U^*U = I$, showing that $U$ is unitary.

For the uniqueness of the decomposition, we have already noted that if $A = UP$,
then $P^2 = A^*A$, where $A^*A$ is self-adjoint and positive. Thus, the uniqueness of
$P$ follows from the uniqueness in Lemma 2.18. The uniqueness of $U$ then follows,
since if $A = UP$, then $U = AP^{-1}$.

The existence and uniqueness of the decomposition in Point 2 are proved in
precisely the same way as in Lemma 2.18, with the logarithm function (which is
a bijection between $(0, \infty)$ and $\mathbb{R}$) replacing the square root function. The same sort
of reasoning shows that for any self-adjoint $X$, the matrix $e^X$ is self-adjoint and
positive.

Finally, we address the continuity claim in Point 3. From the formulas for $P$ and
$U$ in terms of $A$ in the proof of Point 1, we see that $U$ and $P$ depend continuously
on $A$. It remains only to show that the logarithm $X$ of $P$ depends continuously
on $P$. To see this, note that if the eigenvalues of $P$ are between 0 and 2, then the
power series for $\log P$ will converge to $X$, in which case, continuity follows by
the same argument as in the proof of Proposition 2.1. In general, fix some positive,
self-adjoint matrix $P_0$. Choose some large positive number $a$, and for $P$ in a small
neighborhood $V$ of $P_0$, write $P = e^a(e^{-a}P)$. Then $P = e^X$, where

\[
X = aI + \log(e^{-a}P).
\]

Since $a$ is large, the eigenvalues of $e^{-a}P$ will all be less than $\log 2$, and the series
for $\log(e^{-a}P)$ will converge and depend continuously on $P$. □
We now establish polar decompositions for $\mathrm{GL}(n; \mathbb{R})$, $\mathrm{SL}(n; \mathbb{C})$, and $\mathrm{SL}(n; \mathbb{R})$.


Proposition 2.19.


1. Every $A \in \mathrm{GL}(n; \mathbb{R})$ can be written uniquely as

\[
A = R e^X,
\]

where $R$ is in $\mathrm{O}(n)$ and $X$ is real and symmetric.
2. Every $A \in \mathrm{SL}(n; \mathbb{C})$ can be written uniquely as

\[
A = U e^X,
\]

where $U$ is in $\mathrm{SU}(n)$ and $X$ is self-adjoint with trace zero.
3. Every $A \in \mathrm{SL}(n; \mathbb{R})$ can be written uniquely as

\[
A = R e^X,
\]

where $R$ is in $\mathrm{SO}(n)$ and $X$ is real and symmetric and has trace zero.


Proof. If $A$ is real, then $A^*A$ is real and symmetric. Now, a real, symmetric matrix
can be diagonalized over $\mathbb{R}$. Thus, $P$, which is the unique self-adjoint positive
square root of $A^*A$ (constructed as in (2.11)), is real. Then $U = AP^{-1}$ is real
and unitary, hence in $\mathrm{O}(n)$.

Meanwhile, if $A \in \mathrm{SL}(n; \mathbb{C})$ and we write $A = Ue^X$ with $U \in \mathrm{U}(n)$ and $X$
self-adjoint, then $\det(A) = \det(U) e^{\operatorname{trace}(X)}$. Now, $|\det(U)| = 1$, and $e^{\operatorname{trace}(X)}$ is real and
positive. Thus, by the uniqueness of the polar decomposition for nonzero complex
numbers, we must have $\det(U) = 1$ and $\operatorname{trace}(X) = 0$. The case of $A \in \mathrm{SL}(n; \mathbb{R})$
follows by combining the arguments in the two previous cases. □
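For a real invertible $2 \times 2$ matrix, the factors $R$ and $P$ of Point 1 can be computed explicitly. The sketch below does not follow the eigenvector construction (2.11); instead it assumes the closed-form $2 \times 2$ shortcut $M^{1/2} = (M + \sqrt{\det M}\, I)/\sqrt{\operatorname{trace} M + 2\sqrt{\det M}}$ for a positive symmetric $M$, which one can verify by comparing eigenvalues. The function name `polar_2x2` and the sample matrix are hypothetical.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def polar_2x2(A):
    """Return (R, P) with A = R P, R orthogonal, P symmetric positive (A real, invertible, 2 x 2)."""
    M = mat_mul(transpose(A), A)                            # A*A, symmetric and positive
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])    # sqrt(det M)
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)                # sqrt(trace M + 2 sqrt(det M))
    P = [[(M[0][0] + s) / t, M[0][1] / t],
         [M[1][0] / t, (M[1][1] + s) / t]]                  # positive square root of M
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    P_inv = [[P[1][1] / det_P, -P[0][1] / det_P],
             [-P[1][0] / det_P, P[0][0] / det_P]]
    return mat_mul(A, P_inv), P

def max_diff(A, B):
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

A = [[2.0, 1.0], [0.0, 1.0]]                                # sample matrix (assumed)
R, P = polar_2x2(A)

err_reconstruct = max_diff(mat_mul(R, P), A)                                   # A = R P
err_orthogonal = max_diff(mat_mul(transpose(R), R), [[1.0, 0.0], [0.0, 1.0]])  # R^T R = I
det_R = R[0][0] * R[1][1] - R[0][1] * R[1][0]
print(R, P, det_R)
```

Since the sample matrix has positive determinant, the orthogonal factor has determinant 1, in line with the $\mathrm{SL}$ cases of the proposition after rescaling.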
2.6 Exercises


1. The Cauchy-Schwarz inequality from elementary analysis tells us that for all
$u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in $\mathbb{C}^n$, we have

\[
|u_1 v_1 + \cdots + u_n v_n|^2 \le \left( \sum_{j=1}^{n} |u_j|^2 \right) \left( \sum_{k=1}^{n} |v_k|^2 \right).
\]

Use this to verify that $\|XY\| \le \|X\| \, \|Y\|$ for all $X, Y \in M_n(\mathbb{C})$, where $\|\cdot\|$ is
the Hilbert-Schmidt norm in Definition 2.2.

2. Show that for $X \in M_n(\mathbb{C})$ and any orthonormal basis $\{u_1, \ldots, u_n\}$ of $\mathbb{C}^n$,
$\|X\|^2 = \sum_{j,k=1}^{n} |\langle u_j, X u_k \rangle|^2$, where $\|X\|$ is as in Definition 2.2. Now show
that if $v$ is an eigenvector for $X$ with eigenvalue $\lambda$, then $|\lambda| \le \|X\|$.

3. The product rule. Recall that a matrix-valued function $A(t)$ is said to be smooth
if each $A_{jk}(t)$ is smooth. The derivative of such a function is defined as

\[
\left( \frac{dA}{dt} \right)_{jk} = \frac{dA_{jk}}{dt}
\]

or, equivalently,

\[
\frac{d}{dt} A(t) = \lim_{h \to 0} \frac{A(t + h) - A(t)}{h}.
\]

Let $A(t)$ and $B(t)$ be two such functions. Prove that $A(t)B(t)$ is again smooth
and that

\[
\frac{d}{dt} [A(t)B(t)] = \frac{dA}{dt} B(t) + A(t) \frac{dB}{dt}.
\]


4. Using Theorem A.4, show that every n x n complex matrix A is the limit of a 
sequence of diagonalizable matrices. 
Hint: If an n x n matrix has n distinct eigenvalues, it is necessarily 
diagonalizable. 

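5. For any $a$ and $d$ in $\mathbb{C}$, define the expression $(e^a - e^d)/(a - d)$ in the obvious
way for $a \ne d$ and by means of the limit

\[
\lim_{a \to d} \frac{e^a - e^d}{a - d} = e^a
\]

when $a = d$. Show that for any $a, b, d \in \mathbb{C}$, we have

\[
\exp \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}
= \begin{pmatrix} e^a & b \, \dfrac{e^a - e^d}{a - d} \\ 0 & e^d \end{pmatrix}.
\]

Hint: Show that if $a \ne d$, then

\[
\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}^m
= \begin{pmatrix} a^m & b \, \dfrac{a^m - d^m}{a - d} \\ 0 & d^m \end{pmatrix}
\]

for every positive integer $m$.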
6. Show that every $2 \times 2$ matrix $X$ with $\operatorname{trace}(X) = 0$ satisfies

\[
X^2 = -\det(X) I.
\]

If $X$ is $2 \times 2$ with trace zero, show by direct calculation using the power series
for the exponential that

\[
e^X = \cos\left( \sqrt{\det X} \right) I + \frac{\sin \sqrt{\det X}}{\sqrt{\det X}} X, \tag{2.12}
\]

where $\sqrt{\det X}$ is either of the two (possibly complex) square roots of $\det X$. Use
this to give an alternative computation of the exponential $e^{X_1}$ in Example 2.5.
Note: The value of the coefficient of $X$ in (2.12) is to be interpreted as 1 when
$\det X = 0$, in accordance with the limit $\lim_{\theta \to 0} \sin\theta/\theta = 1$.

7. Use the result of Exercise 6 to compute the exponential of the matrix

\[
X = \begin{pmatrix} 4 & 3 \\ -1 & 2 \end{pmatrix}.
\]

Hint: Reduce the calculation to the trace-zero case.

8. Consider the two matrices $X_4$ and $X_5$ in (2.6). Compute $e^{tX_4}$ and $e^{tX_5}$ either by
diagonalization or by the method in Exercises 6 and 7. Show that curves of the
form $t \mapsto e^{tX_4} v_0$, with $v_0 \ne 0$, spiral out to infinity. Show that for $v_0$ outside of
a certain one-dimensional subspace of $\mathbb{R}^2$, curves of the form $t \mapsto e^{tX_5} v_0$ tend
to infinity in the direction of $(1, 1)$ or the direction of $(-1, -1)$.

9. A matrix A is said to be unipotent if A — I is nilpotent (i.e., if A is of the 
form A = I + N, with N nilpotent). Note that log A is defined whenever A is 
unipotent, because the series in Definition 2.7 terminates. 


(a) Show that if A is unipotent, then log A is nilpotent. 

(b) Show that if $X$ is nilpotent, then $e^X$ is unipotent.

(c) Show that if A is unipotent, then exp(log A) = A and that if X is nilpotent, 
then log(exp X) = X. 


Hint: Let $A(t) = I + t(A - I)$. Show that $\exp(\log(A(t)))$ depends polynomially
on $t$ and that $\exp(\log(A(t))) = A(t)$ for all sufficiently small $t$.

10. Show that every invertible $n \times n$ matrix $A$ can be written as $A = e^X$ for some
$X \in M_n(\mathbb{C})$.
Hint: Theorem A.5 implies that $A$ is similar to a block-diagonal matrix in which
each block is of the form $\lambda I + N_\lambda$, with $N_\lambda$ being nilpotent. Use this result and
Exercise 9.

11. Show that for all $X \in M_n(\mathbb{C})$, we have

\[
\lim_{m \to \infty} \left( I + \frac{X}{m} \right)^m = e^X.
\]

Hint: Use the matrix logarithm.


Chapter 3 
Lie Algebras 


3.1 Definitions and First Examples 


We now introduce the "abstract" notion of a Lie algebra. In Sect. 3.3, we will
associate to each matrix Lie group a Lie algebra. It is customary to use lowercase
Gothic (Fraktur) characters such as $\mathfrak{g}$ and $\mathfrak{h}$ to refer to Lie algebras.


Definition 3.1. A finite-dimensional real or complex Lie algebra is a
finite-dimensional real or complex vector space $\mathfrak{g}$, together with a map $[\cdot,\cdot]$ from $\mathfrak{g} \times \mathfrak{g}$
into $\mathfrak{g}$, with the following properties:


1. $[\cdot,\cdot]$ is bilinear.
2. $[\cdot,\cdot]$ is skew symmetric: $[X, Y] = -[Y, X]$ for all $X, Y \in \mathfrak{g}$.
3. The Jacobi identity holds:

\[
[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
\]

for all $X, Y, Z \in \mathfrak{g}$.


Two elements $X$ and $Y$ of a Lie algebra $\mathfrak{g}$ commute if $[X, Y] = 0$. A Lie algebra
$\mathfrak{g}$ is commutative if $[X, Y] = 0$ for all $X, Y \in \mathfrak{g}$.


The map $[\cdot,\cdot]$ is referred to as the bracket operation on $\mathfrak{g}$. Note also that
Condition 2 implies that $[X, X] = 0$ for all $X \in \mathfrak{g}$. The bracket operation on a
Lie algebra is not, in general, associative; nevertheless, the Jacobi identity can be
viewed as a substitute for associativity.


Example 3.2. Let $\mathfrak{g} = \mathbb{R}^3$ and let $[\cdot,\cdot] : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ be given by

\[
[x, y] = x \times y,
\]

where $x \times y$ is the cross product (or vector product). Then $\mathfrak{g}$ is a Lie algebra.


Proof. Bilinearity and skew symmetry are standard properties of the cross product.
To verify the Jacobi identity, it suffices (by bilinearity) to verify it when $x = e_j$,
$y = e_k$, and $z = e_l$, where $e_1$, $e_2$, and $e_3$ are the standard basis elements for $\mathbb{R}^3$.
If $j$, $k$, and $l$ are all equal, each term in the Jacobi identity is zero. If $j$, $k$, and $l$
are all different, the cross product of any two of $e_j$, $e_k$, and $e_l$ is equal to a multiple
of the third, so again, each term in the Jacobi identity is zero. It remains to consider
the case in which two of $j, k, l$ are equal and the third is different. By re-ordering
the terms in the Jacobi identity as necessary, it suffices to verify the identity

\[
[e_j, [e_j, e_k]] + [e_j, [e_k, e_j]] + [e_k, [e_j, e_j]] = 0. \tag{3.1}
\]

The first two terms in (3.1) are negatives of each other and the third is zero. □
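The case analysis in the proof can also be confirmed by brute force over all triples of basis vectors. The following sketch uses exact integer arithmetic; the function names are illustrative. It also exhibits one instance of the failure of associativity.

```python
from itertools import product

def cross(x, y):
    """Cross product on R^3, playing the role of the bracket [x, y]."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def add3(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def jacobi(x, y, z):
    """Left-hand side of the Jacobi identity for the cross product."""
    return add3(cross(x, cross(y, z)), cross(y, cross(z, x)), cross(z, cross(x, y)))

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# By bilinearity it suffices to test the 27 triples of basis vectors.
jacobi_holds = all(jacobi(x, y, z) == (0, 0, 0) for x, y, z in product(basis, repeat=3))

# The bracket is not associative: [e1, [e1, e2]] differs from [[e1, e1], e2].
e1, e2 = basis[0], basis[1]
nested_right = cross(e1, cross(e1, e2))
nested_left = cross(cross(e1, e1), e2)
print(jacobi_holds, nested_right, nested_left)
```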


Example 3.3. Let $\mathcal{A}$ be an associative algebra and let $\mathfrak{g}$ be a subspace of $\mathcal{A}$ such that
$XY - YX \in \mathfrak{g}$ for all $X, Y \in \mathfrak{g}$. Then $\mathfrak{g}$ is a Lie algebra with bracket operation given by

\[
[X, Y] = XY - YX.
\]


Proof. The bilinearity and skew symmetry of the bracket are evident. To verify the
Jacobi identity, note that each double bracket generates four terms, for a total of 12
terms. It is left to the reader to verify that the product of $X$, $Y$, and $Z$ in each of the
six possible orderings occurs twice, once with a plus sign and once with a minus
sign. □


If we look carefully at the proof of the Jacobi identity, we see that the two
occurrences of, say, $XYZ$ occur with different groupings, once as $X(YZ)$ and once
as $(XY)Z$. Thus, associativity of the algebra $\mathcal{A}$ is essential. For any Lie algebra, the
Jacobi identity means that the bracket operation behaves as if it were $XY - YX$ in
some associative algebra, even if it is not actually defined this way. Indeed, we will
prove in Chapter 9 that every Lie algebra $\mathfrak{g}$ can be embedded into an associative
algebra $\mathcal{A}$ in such a way that the bracket becomes $XY - YX$. (This claim follows
from Theorem 6.7, the Poincaré-Birkhoff-Witt theorem.)

Of particular interest to us is the case in which $\mathcal{A}$ is the space $M_n(\mathbb{C})$ of all $n \times n$
complex matrices.


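Example 3.4. Let $\mathfrak{sl}(n; \mathbb{C})$ denote the space of matrices $X \in M_n(\mathbb{C})$ for which $\operatorname{trace}(X) = 0$. Then
$\mathfrak{sl}(n; \mathbb{C})$ is a Lie algebra with bracket $[X, Y] = XY - YX$.


Proof. For any $X$ and $Y$ in $M_n(\mathbb{C})$, we have

\[
\operatorname{trace}(XY - YX) = \operatorname{trace}(XY) - \operatorname{trace}(YX) = 0.
\]

This holds, in particular, if $X$ and $Y$ have trace zero. Thus, Example 3.3 applies. □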


Definition 3.5. A subalgebra of a real or complex Lie algebra $\mathfrak{g}$ is a subspace $\mathfrak{h}$ of
$\mathfrak{g}$ such that $[H_1, H_2] \in \mathfrak{h}$ for all $H_1$ and $H_2 \in \mathfrak{h}$. If $\mathfrak{g}$ is a complex Lie algebra and
$\mathfrak{h}$ is a real subspace of $\mathfrak{g}$ which is closed under brackets, then $\mathfrak{h}$ is said to be a real
subalgebra of $\mathfrak{g}$.

A subalgebra $\mathfrak{h}$ of a Lie algebra $\mathfrak{g}$ is said to be an ideal in $\mathfrak{g}$ if $[X, H] \in \mathfrak{h}$ for all
$X$ in $\mathfrak{g}$ and $H$ in $\mathfrak{h}$.

The center of a Lie algebra $\mathfrak{g}$ is the set of all $X \in \mathfrak{g}$ for which $[X, Y] = 0$ for all
$Y \in \mathfrak{g}$.


Definition 3.6. If $\mathfrak{g}$ and $\mathfrak{h}$ are Lie algebras, then a linear map $\phi : \mathfrak{g} \to \mathfrak{h}$ is called a
Lie algebra homomorphism if $\phi([X, Y]) = [\phi(X), \phi(Y)]$ for all $X, Y \in \mathfrak{g}$. If, in
addition, $\phi$ is one-to-one and onto, then $\phi$ is called a Lie algebra isomorphism.
A Lie algebra isomorphism of a Lie algebra with itself is called a Lie algebra
automorphism.


Definition 3.7. If $\mathfrak{g}$ is a Lie algebra and $X$ is an element of $\mathfrak{g}$, define a linear map
$\mathrm{ad}_X : \mathfrak{g} \to \mathfrak{g}$ by

\[
\mathrm{ad}_X(Y) = [X, Y].
\]

The map $X \mapsto \mathrm{ad}_X$ is the adjoint map or adjoint representation.


Although $\mathrm{ad}_X(Y)$ is just $[X, Y]$, the alternative "ad" notation can be useful. For
example, instead of writing

\[
[X, [X, [X, [X, Y]]]],
\]

we can now write

\[
(\mathrm{ad}_X)^4(Y).
\]

This sort of notation will be essential in Chapter 5. We can view ad (that is, the map
$X \mapsto \mathrm{ad}_X$) as a linear map of $\mathfrak{g}$ into $\mathrm{End}(\mathfrak{g})$, the space of linear operators on $\mathfrak{g}$.
The Jacobi identity is then equivalent to the assertion that $\mathrm{ad}_X$ is a derivation of the
bracket:

\[
\mathrm{ad}_X([Y, Z]) = [\mathrm{ad}_X(Y), Z] + [Y, \mathrm{ad}_X(Z)]. \tag{3.2}
\]

Proposition 3.8. If $\mathfrak{g}$ is a Lie algebra, then

\[
\mathrm{ad}_{[X,Y]} = \mathrm{ad}_X \mathrm{ad}_Y - \mathrm{ad}_Y \mathrm{ad}_X = [\mathrm{ad}_X, \mathrm{ad}_Y];
\]

that is, $\mathrm{ad} : \mathfrak{g} \to \mathrm{End}(\mathfrak{g})$ is a Lie algebra homomorphism.


Proof. Observe that

\[
\mathrm{ad}_{[X,Y]}(Z) = [[X, Y], Z],
\]

whereas

\[
[\mathrm{ad}_X, \mathrm{ad}_Y](Z) = [X, [Y, Z]] - [Y, [X, Z]].
\]

Thus, we want to show that

\[
[[X, Y], Z] = [X, [Y, Z]] - [Y, [X, Z]],
\]

which is equivalent to the Jacobi identity. □
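For matrix Lie algebras with bracket $XY - YX$, both Proposition 3.8 and the derivation property (3.2) can be checked exactly on integer matrices. The sketch below is illustrative only; the three matrices and the helper names (`bracket`, `ad`) are arbitrary choices.

```python
def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Commutator bracket [A, B] = AB - BA."""
    return sub(mul(A, B), mul(B, A))

def ad(X):
    """The linear map ad_X : Y -> [X, Y], returned as a Python function."""
    return lambda Y: bracket(X, Y)

# Arbitrary integer matrices, so every comparison below is exact.
X = [[1, 2, 0], [0, -1, 3], [4, 0, 0]]
Y = [[0, 1, 1], [2, 0, -1], [0, 5, 0]]
Z = [[3, 0, -2], [1, 1, 0], [0, 2, -4]]

# Proposition 3.8: ad_[X,Y] = ad_X ad_Y - ad_Y ad_X, evaluated on Z.
lhs = ad(bracket(X, Y))(Z)
rhs = sub(ad(X)(ad(Y)(Z)), ad(Y)(ad(X)(Z)))
homomorphism_holds = lhs == rhs

# Equation (3.2): ad_X is a derivation of the bracket.
derivation_holds = ad(X)(bracket(Y, Z)) == add(bracket(ad(X)(Y), Z), bracket(Y, ad(X)(Z)))
print(homomorphism_holds, derivation_holds)
```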


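Definition 3.9. If $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are Lie algebras, the direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$ is the
vector space direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$, with bracket given by

\[
[(X_1, X_2), (Y_1, Y_2)] = ([X_1, Y_1], [X_2, Y_2]). \tag{3.3}
\]

If $\mathfrak{g}$ is a Lie algebra and $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are subalgebras, we say that $\mathfrak{g}$ decomposes as
the Lie algebra direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$ if $\mathfrak{g}$ is the direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$ as vector
spaces and $[X_1, X_2] = 0$ for all $X_1 \in \mathfrak{g}_1$ and $X_2 \in \mathfrak{g}_2$.


It is straightforward to verify that the bracket in (3.3) makes $\mathfrak{g}_1 \oplus \mathfrak{g}_2$ into a Lie
algebra. If $\mathfrak{g}$ decomposes as a Lie algebra direct sum of subalgebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, it is
easy to check that $\mathfrak{g}$ is isomorphic as a Lie algebra to the "abstract" direct sum of $\mathfrak{g}_1$
and $\mathfrak{g}_2$. (This would not be the case without the assumption that every element of $\mathfrak{g}_1$
commutes with every element of $\mathfrak{g}_2$.)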


Definition 3.10. Let $\mathfrak{g}$ be a finite-dimensional real or complex Lie algebra, and let
$X_1, \ldots, X_N$ be a basis for $\mathfrak{g}$ (as a vector space). Then the unique constants $c_{jkl}$ such
that

\[
[X_j, X_k] = \sum_{l=1}^{N} c_{jkl} X_l \tag{3.4}
\]

are called the structure constants of $\mathfrak{g}$ (with respect to the chosen basis).


Although we will not have much occasion to use them, structure constants
do appear frequently in the physics literature. The structure constants satisfy the
following two conditions:

\[
c_{jkl} + c_{kjl} = 0,
\]

\[
\sum_{n} \left( c_{jkn} c_{nlm} + c_{kln} c_{njm} + c_{ljn} c_{nkm} \right) = 0
\]

for all $j, k, l, m$. The first of these conditions comes from the skew symmetry of the
bracket, and the second comes from the Jacobi identity.
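As a concrete instance, the structure constants of the cross-product algebra of Example 3.2 in the standard basis can be tabulated and the two conditions checked directly. The sketch below uses exact integer arithmetic; the array name `c` mirrors $c_{jkl}$ with indices shifted to start at 0, and the other names are illustrative.

```python
from itertools import product

def cross(x, y):
    """Cross product on R^3, the bracket of Example 3.2."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
idx = range(3)

# c[j][k][l] is the coefficient of e_l in [e_j, e_k], as in (3.4).
c = [[list(cross(basis[j], basis[k])) for k in idx] for j in idx]

# First condition: skew symmetry in the first two indices.
skew_ok = all(c[j][k][l] + c[k][j][l] == 0 for j, k, l in product(idx, repeat=3))

# Second condition: the Jacobi identity expressed through the constants.
jacobi_ok = all(
    sum(c[j][k][n] * c[n][l][m] + c[k][l][n] * c[n][j][m] + c[l][j][n] * c[n][k][m]
        for n in idx) == 0
    for j, k, l, m in product(idx, repeat=4)
)
print(skew_ok, jacobi_ok, c[0][1][2])
```

In this basis the constants are the entries of the totally antisymmetric symbol; for example, $[e_1, e_2] = e_3$ gives the value 1 printed last.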
3.2 Simple, Solvable, and Nilpotent Lie Algebras


In this section, we consider various special types of Lie algebras. Recall from 
Definition 3.5 the notion of an ideal in a Lie algebra. 


Definition 3.11. A Lie algebra $\mathfrak{g}$ is called irreducible if the only ideals in $\mathfrak{g}$ are $\mathfrak{g}$
and $\{0\}$. A Lie algebra $\mathfrak{g}$ is called simple if it is irreducible and $\dim \mathfrak{g} \ge 2$.


A one-dimensional Lie algebra is certainly irreducible, since it has no
nontrivial subspaces and therefore no nontrivial subalgebras and no nontrivial
ideals. Nevertheless, such a Lie algebra is, by definition, not considered simple.

Note that a one-dimensional Lie algebra $\mathfrak{g}$ is necessarily commutative, since
$[aX, bX] = 0$ for any $X \in \mathfrak{g}$ and any scalars $a$ and $b$. On the other hand,
if $\mathfrak{g}$ is commutative, then any subspace of $\mathfrak{g}$ is an ideal. Thus, the only way a
commutative Lie algebra can be irreducible is if it is one dimensional. Thus, an
equivalent definition of "simple" is that a Lie algebra is simple if it is irreducible
and noncommutative.

There is an analogy between groups and Lie algebras, in which the role of 
subgroups is played by subalgebras and the role of normal subgroups is played by 
ideals. (For example, the kernel of a Lie algebra homomorphism is always an ideal, 
just as the kernel of a Lie group homomorphism is always a normal subgroup.) There 
is, however, an inconsistency in the terminology in the two fields. On the group side, 
any group with no nontrivial normal subgroups is called simple, including the most 
obvious example, a cyclic group of prime order. On the Lie algebra side, by contrast, 
the most obvious example of an algebra with no nontrivial ideals—namely, a one- 
dimensional algebra—is not called simple. 

We will eventually see many examples of simple Lie algebras, but for now
we content ourselves with a single example. Recall the Lie algebra $\mathfrak{sl}(n; \mathbb{C})$ in
Example 3.4.


Proposition 3.12. The Lie algebra sl(2; C) is simple.

Proof. We use the following basis for sl(2; C):

\[
X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}; \quad
Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}; \quad
H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

Direct calculation shows that these basis elements have the following commutation
relations: [X, Y] = H, [H, X] = 2X, and [H, Y] = −2Y. Suppose h is an ideal in
sl(2; C) and that h contains an element Z = aX + bH + cY, where a, b, and c are
not all zero. We will show, then, that h = sl(2; C). Suppose first that c ≠ 0. Then
the element

\[
[X, [X, Z]] = [X, -2bX + cH] = -2cX
\]

is a nonzero multiple of X. Since h is an ideal, we conclude that X ∈ h. But [Y, X]
is a nonzero multiple of H and [Y, [Y, X]] is a nonzero multiple of Y, showing that
Y and H also belong to h, from which we conclude that h = sl(2; C).

Suppose next that c = 0 but b ≠ 0. Then [X, Z] is a nonzero multiple of X
and we may then apply the same argument in the previous paragraph to show that
h = sl(2; C). Finally, if c = 0 and b = 0 but a ≠ 0, then Z itself is a nonzero
multiple of X and we again conclude that h = sl(2; C). □
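
The commutation relations used in this proof can be checked mechanically. The following is a minimal sketch in Python, using only integer arithmetic on nested lists; the helper names (mul, bracket, comb) and the sample coefficients are illustrative assumptions, not notation from the text.

```python
# Exact integer check of the sl(2; C) relations used in the proof of Proposition 3.12.
# Helper names and sample coefficients are hypothetical, chosen for illustration.

def mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(AB, BA)]

X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

def comb(a, b, c):
    """The element aX + bH + cY."""
    return [[a * X[i][j] + b * H[i][j] + c * Y[i][j] for j in range(2)]
            for i in range(2)]

assert bracket(X, Y) == H
assert bracket(H, X) == comb(2, 0, 0)     # [H, X] = 2X
assert bracket(H, Y) == comb(0, 0, -2)    # [H, Y] = -2Y

# Sample coefficients with c != 0 (an arbitrary choice).
a, b, c = 3, 5, 7
Z = comb(a, b, c)
assert bracket(X, Z) == comb(-2 * b, c, 0)               # [X, Z] = -2bX + cH
assert bracket(X, bracket(X, Z)) == comb(-2 * c, 0, 0)   # [X, [X, Z]] = -2cX
```

A check of this kind only confirms the relations for the sample values; the proof above is what establishes them in general.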


Definition 3.13. If g is a Lie algebra, then the commutator ideal in g, denoted
[g, g], is the space of linear combinations of commutators, that is, the space of
elements Z in g that can be expressed as

\[
Z = c_1 [X_1, Y_1] + \cdots + c_m [X_m, Y_m]
\]

for some constants $c_j$ and vectors $X_j, Y_j \in g$.


For any X and Y in g, the commutator [X, Y] is in [g, g]. This holds, in particular, 
if X is in [g, g], showing that [g, g] is an ideal in g. 


Definition 3.14. For any Lie algebra g, we define a sequence of subalgebras
$g_0, g_1, g_2, \ldots$ of g inductively as follows: $g_0 = g$, $g_1 = [g_0, g_0]$, $g_2 = [g_1, g_1]$,
etc. These subalgebras are called the derived series of g. A Lie algebra g is called
solvable if $g_j = \{0\}$ for some j.


In light of the comments following Definition 3.13, each derived algebra $g_j$ is an
ideal in $g_{j-1}$. In fact, an induction on j using the Jacobi identity shows that each
$g_j$ is an ideal in g itself.


Definition 3.15. For any Lie algebra g, we define a sequence of ideals $g^j$ in g
inductively as follows. We set $g^0 = g$ and then define $g^{j+1}$ to be the space of
linear combinations of commutators of the form [X, Y] with X ∈ g and Y ∈ $g^j$.
These algebras are called the upper central series of g. A Lie algebra g is said to
be nilpotent if $g^j = \{0\}$ for some j.


Equivalently, $g^j$ is the space spanned by all jth-order commutators,

\[
[X_1, [X_2, [X_3, \ldots [X_j, X_{j+1}] \ldots ]]].
\]

Note that every jth-order commutator is also a (j − 1)th-order commutator, by
setting $\tilde{X}_j = [X_j, X_{j+1}]$. Thus, $g^{j-1} \supset g^j$. For every X ∈ g and Y ∈ $g^j$, we have
$[X, Y] \in g^{j+1} \subset g^j$, showing that $g^j$ is an ideal in g. Furthermore, it is clear that
$g_j \subset g^j$ for all j; thus, if g is nilpotent, g is also solvable.


Proposition 3.16. If g ⊂ $M_3(\mathbb{R})$ denotes the space of 3 × 3 upper triangular
matrices with zeros on the diagonal, then g satisfies the assumptions of Example 3.3.
The Lie algebra g is a nilpotent Lie algebra.

Proof. We will use the following basis for g:

\[
X = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad
Y = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}; \quad
Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{3.5}
\]

Direct calculation then establishes the following commutation relations: [X, Y] =
Z and [X, Z] = [Y, Z] = 0. In particular, the bracket of two elements of g is again
in g, so that g is a Lie algebra. Then [g, g] is the span of Z and [g, [g, g]] = 0,
showing that g is nilpotent. □
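
The "direct calculation" for the basis (3.5) can be reproduced with a few lines of integer arithmetic. The sketch below is an illustration only; the helper names are assumptions, not notation from the text.

```python
# Integer check of the relations for the basis (3.5); helper names are illustrative.

def mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(AB, BA)]

X = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
Z = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
ZERO = [[0, 0, 0] for _ in range(3)]

assert bracket(X, Y) == Z
assert bracket(X, Z) == ZERO and bracket(Y, Z) == ZERO

# Every double bracket of basis elements vanishes, so [g, [g, g]] = {0}.
basis = [X, Y, Z]
double_brackets = [bracket(A, bracket(B, C))
                   for A in basis for B in basis for C in basis]
assert all(M == ZERO for M in double_brackets)
```

Because the bracket is bilinear, checking the double brackets on basis elements suffices for all of g.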


Proposition 3.17. If g ⊂ $M_2(\mathbb{C})$ denotes the space of 2 × 2 matrices of the form

\[
\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}
\]

with a, b, and c in C, then g satisfies the assumptions of Example 3.3. The Lie
algebra g is solvable but not nilpotent.


Proof. Direct calculation shows that

\[
\left[ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix},
\begin{pmatrix} d & e \\ 0 & f \end{pmatrix} \right]
= \begin{pmatrix} 0 & h \\ 0 & 0 \end{pmatrix}, \tag{3.6}
\]

where h = ae + bf − bd − ce, showing that g is a Lie subalgebra of $M_2(\mathbb{C})$.
Furthermore, the commutator ideal [g, g] is one dimensional and hence commutative. Thus,
$g_2 = \{0\}$, showing that g is solvable. On the other hand, consider the following
elements of g:

\[
H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad
X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
\]

Using (3.6), we can see that [H, X] = 2X, and thus that

\[
[H, [H, [H, \cdots [H, X] \cdots ]]]
\]

is a nonzero multiple of X, showing that $g^j \neq \{0\}$ for all j. □
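
As an illustration of this proof, the following sketch checks formula (3.6) for one sample choice of entries and confirms that iterating the bracket with H on X produces the multiples 2X, 4X, 8X, and so on. The helper names and the sample values are assumptions made for the example.

```python
# Check of (3.6) and of the iterated brackets [H, [H, ... [H, X] ...]] = 2^k X.
# Helper names and sample entries are hypothetical.

def mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(AB, BA)]

def upper(a, b, c):
    """The upper triangular matrix with rows (a, b) and (0, c)."""
    return [[a, b], [0, c]]

# Formula (3.6) for one sample choice of entries.
a, b, c, d, e, f = 2, 3, 5, 7, 11, 13
h = a * e + b * f - b * d - c * e
assert bracket(upper(a, b, c), upper(d, e, f)) == [[0, h], [0, 0]]

# Iterated brackets with H never vanish: the k-fold bracket equals 2^k X.
H = upper(1, 0, -1)
X = upper(0, 1, 0)
iterated = []
W = X
for k in range(1, 6):
    W = bracket(H, W)
    iterated.append(W)
assert iterated == [[[0, 2 ** k], [0, 0]] for k in range(1, 6)]
```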


3.3 The Lie Algebra of a Matrix Lie Group 


In this section, we associate to each matrix Lie group G a Lie algebra g. Many 
questions involving a group can be studied by transferring them to the Lie algebra, 
where we can use tools of linear algebra. We begin by defining g as a set, and then 
proceed to give g the structure of a Lie algebra. 
Definition 3.18. Let G be a matrix Lie group. The Lie algebra of G, denoted g, is
the set of all matrices X such that $e^{tX}$ is in G for all real numbers t.


Equivalently, X is in g if and only if the entire one-parameter subgroup
(Definition 2.13) generated by X lies in G. Note that merely having $e^X$ in G
does not guarantee that X is in g. Even though G is a subgroup of GL(n; C) (and
not necessarily of GL(n; R)), we do not require that $e^{tX}$ be in G for all complex
numbers t, but only for all real numbers t. We will show in Sect. 3.7 that every
matrix Lie group is an embedded submanifold of GL(n; C). We will then show
(Corollary 3.46) that g is the tangent space to G at the identity.

We will now establish various basic properties of the Lie algebra g of a matrix 
Lie group G. In particular, we will see that there is a bracket operation on g that 
makes g into a Lie algebra in the sense of Definition 3.1. 


Proposition 3.19. Let G be a matrix Lie group, and X an element of its Lie algebra.
Then $e^X$ is an element of the identity component $G_0$ of G.


Proof. By definition of the Lie algebra, $e^{tX}$ lies in G for all real t. However, as t
varies from 0 to 1, $e^{tX}$ is a continuous path connecting the identity to $e^X$. □


Theorem 3.20. Let G be a matrix Lie group with Lie algebra g. If X and Y are
elements of g, the following results hold.


1. $AXA^{-1} \in g$ for all A ∈ G.

2. sX ∈ g for all real numbers s.

3. X + Y ∈ g.

4. XY − YX ∈ g.


It follows from this result and Example 3.3 that the Lie algebra of a matrix Lie
group is a real Lie algebra, with bracket given by [X, Y] = XY − YX. For X and Y
in g, we refer to [X, Y] = XY − YX ∈ g as the bracket or commutator of X and Y.


Proof. For Point 1, we observe that, by Proposition 2.3,

\[
e^{t(AXA^{-1})} = A e^{tX} A^{-1} \in G
\]

for all t, showing that $AXA^{-1}$ is in g. For Point 2, we observe that $e^{t(sX)} = e^{(ts)X}$,
which must be in G for all t ∈ R if X is in g. For Point 3 we use the Lie product
formula, which says that

\[
e^{t(X+Y)} = \lim_{m \to \infty} \left( e^{tX/m} e^{tY/m} \right)^m .
\]

Thus, $\left( e^{tX/m} e^{tY/m} \right)^m$ is in G for all m. Since G is closed, the limit (which is
invertible) must be again in G. This shows that X + Y is again in g.

Finally, for Point 4, we use the product rule (Exercise 3) and Proposition 2.4 to
compute

\[
\frac{d}{dt} \left( e^{tX} Y e^{-tX} \right) \Big|_{t=0} = (XY)e^{0} + (e^{0}Y)(-X) = XY - YX.
\]

Now, by Point 1, $e^{tX} Y e^{-tX}$ is in g for all t. Furthermore, by Points 2 and 3, g is a real
subspace of $M_n(\mathbb{C})$, from which it follows that g is a (topologically) closed subset
of $M_n(\mathbb{C})$. Thus,

\[
XY - YX = \lim_{h \to 0} \frac{e^{hX} Y e^{-hX} - Y}{h}
\]

belongs to g. □
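
The Lie product formula used for Point 3 can be observed numerically. The sketch below is a rough illustration under stated assumptions: it uses only the Python standard library, approximates the matrix exponential by a truncated power series (adequate for the small matrices used here), and takes X and Y to be two skew-symmetric 3 × 3 matrices. All helper names are hypothetical.

```python
# Numerical sketch of the Lie product formula, standard library only.
# The truncated-series exponential and all helper names are illustrative choices.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * a for a in row] for row in A]

def expm(A, terms=30):
    """Truncated power series for e^A; adequate for matrices of small norm."""
    result = term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, A))   # term = A^k / k!
        result = add(result, term)
    return result

def power(A, m):
    """A^m by repeated multiplication."""
    P = identity(len(A))
    for _ in range(m):
        P = mul(P, A)
    return P

def max_diff(A, B):
    """Largest absolute entrywise difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Two skew-symmetric matrices (sample elements of the Lie algebra of SO(3)).
X = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
Y = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]

def product_formula_error(m):
    """Distance between (e^{X/m} e^{Y/m})^m and e^{X+Y}."""
    step = mul(expm(scale(1.0 / m, X)), expm(scale(1.0 / m, Y)))
    return max_diff(power(step, m), expm(add(X, Y)))

err_coarse = product_formula_error(16)
err_fine = product_formula_error(1024)

# Each approximant is a product of rotations, so it stays (numerically) orthogonal.
P = power(mul(expm(scale(1.0 / 16, X)), expm(scale(1.0 / 16, Y))), 16)
PT = [list(col) for col in zip(*P)]
orth_defect = max_diff(mul(P, PT), identity(3))
```

The error shrinks as m grows, while every approximant remains in the group; this is the mechanism by which closedness of G forces the limit to lie in G.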


Note that even if the elements of G have complex entries, the Lie algebra g of 
G is not necessarily a complex vector space, since Point 2 holds, in general, only 
for s € R. Nevertheless, it may happen in certain cases that g is a complex vector 
space. 


Definition 3.21. A matrix Lie group G is said to be complex if its Lie algebra g is
a complex subspace of $M_n(\mathbb{C})$, that is, if iX ∈ g for all X ∈ g.


Examples of complex groups are GL(n; C), SL(n; C), SO(n; C), and Sp(n; C),
as the calculations in Sect. 3.4 will show.


Proposition 3.22. If G is commutative then g is commutative. 


We will see in Sect. 3.7 that if G is connected and g is commutative, G must be 
commutative. 


Proof. For any two matrices X, Y ∈ $M_n(\mathbb{C})$, the commutator of X and Y may be
computed as

\[
[X, Y] = \frac{d}{dt} \left( \frac{d}{ds}\, e^{tX} e^{sY} e^{-tX} \Big|_{s=0} \right) \Bigg|_{t=0}. \tag{3.7}
\]

If G is commutative and X and Y belong to g, then $e^{tX}$ commutes with $e^{sY}$ and the
expression in parentheses on the right-hand side of (3.7) is independent of t, so that
[X, Y] = 0. □


3.4 Examples 


Physicists are accustomed to using the map $t \mapsto e^{itX}$ rather than $t \mapsto e^{tX}$. Thus,
the physicists’ expressions for the Lie algebras of matrix Lie groups will differ by a 
factor of i from the expressions we now derive. 
Proposition 3.23. The Lie algebra of GL(n; C) is the space $M_n(\mathbb{C})$ of all n × n
matrices with complex entries. Similarly, the Lie algebra of GL(n; R) is equal to
$M_n(\mathbb{R})$. The Lie algebra of SL(n; C) consists of all n × n complex matrices with
trace zero, and the Lie algebra of SL(n; R) consists of all n × n real matrices with
trace zero.


We denote the Lie algebras of these groups as gl(n; C), gl(n; R), sl(n; C), and
sl(n; R), respectively.


Proof. If X ∈ $M_n(\mathbb{C})$, then $e^{tX}$ is invertible, so that X belongs to the Lie algebra
of GL(n; C). If X ∈ $M_n(\mathbb{R})$, then $e^{tX}$ is invertible and real, so that X is in
the Lie algebra of GL(n; R). Conversely, if $e^{tX}$ is real for all real t, then X =
$d(e^{tX})/dt|_{t=0}$ must also be real. If X ∈ $M_n(\mathbb{C})$ has trace zero, then by Theorem 2.12,
$\det(e^{tX}) = 1$, showing that X is in the Lie algebra of SL(n; C). Conversely, if
$\det(e^{tX}) = e^{t\,\mathrm{trace}(X)} = 1$ for all real t, then

\[
\mathrm{trace}(X) = \frac{d}{dt}\, e^{t\,\mathrm{trace}(X)} \Big|_{t=0} = 0.
\]

Finally, if X is real and has trace zero, then $e^{tX}$ is real and has determinant 1 for all
real t, showing that X is in the Lie algebra of SL(n; R). Conversely, if $e^{tX}$ is real
and has determinant 1 for all real t, the preceding arguments show that X must be
real and have trace zero. □
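
The identity $\det(e^{tX}) = e^{t\,\mathrm{trace}(X)}$ that drives this proof can be checked numerically for 2 × 2 real matrices. The sketch below is an illustration only: it assumes a truncated power series for the exponential (sufficient for the small sample matrices chosen here), and the function names are hypothetical.

```python
# Numerical check that det(e^{tX}) = e^{t trace(X)} for sample 2x2 real matrices.
# The truncated-series exponential and the helper names are illustrative choices.
import math

def mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm2(A, terms=40):
    """Truncated power series for e^A; adequate for matrices of small norm."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mul2(term, A)]   # A^k / k!
        result = [[r + s for r, s in zip(rr, rs)] for rr, rs in zip(result, term)]
    return result

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def trace2(A):
    return A[0][0] + A[1][1]

def det_exp_gap(X, t):
    """|det(e^{tX}) - e^{t trace(X)}| for a 2x2 real matrix X."""
    tX = [[t * x for x in row] for row in X]
    return abs(det2(expm2(tX)) - math.exp(t * trace2(X)))

traceless = [[1.0, 2.0], [3.0, -1.0]]   # trace 0, so det(e^{tX}) should be 1
generic = [[1.0, 2.0], [3.0, 4.0]]      # trace 5

gaps_traceless = [det_exp_gap(traceless, t) for t in (-0.5, 0.25, 1.0)]
gap_generic = det_exp_gap(generic, 0.3)
```

For the traceless matrix the determinant of $e^{tX}$ stays at 1 for each sampled t, which is the condition for $e^{tX}$ to lie in SL(2; R).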


Proposition 3.24. The Lie algebra of U(n) consists of all complex matrices
satisfying $X^* = -X$ and the Lie algebra of SU(n) consists of all complex matrices
satisfying $X^* = -X$ and trace(X) = 0. The Lie algebra of the orthogonal group
O(n) consists of all real matrices X satisfying $X^{tr} = -X$ and the Lie algebra of
SO(n) is the same as that of O(n).


The Lie algebras of U(n) and SU(n) are denoted u(n) and su(n), respectively.
The Lie algebra of SO(n) (which is the same as that of O(n)) is denoted so(n).


Proof. A matrix U is unitary if and only if $U^* = U^{-1}$. Thus, $e^{tX}$ is unitary if and
only if

\[
(e^{tX})^* = (e^{tX})^{-1} = e^{-tX}. \tag{3.8}
\]

By Point 2 of Proposition 2.3, $(e^{tX})^* = e^{tX^*}$ and so (3.8) becomes

\[
e^{tX^*} = e^{-tX}. \tag{3.9}
\]

The condition (3.9) holds for all real t if and only if $X^* = -X$. Thus, the Lie
algebra of U(n) consists precisely of matrices X such that $X^* = -X$. As in the
proof of Proposition 3.23, adding the “determinant 1” condition at the group level
adds the “trace 0” condition at the Lie algebra level.

An exactly similar argument over R shows that a real matrix X belongs to the Lie
algebra of O(n) if and only if $X^{tr} = -X$. Since any such matrix has trace(X) = 0
(since the diagonal entries of X are all zero), we see that every element of the Lie
algebra of O(n) is also in the Lie algebra of SO(n). □


Proposition 3.25. If g is the matrix in Exercise 1 of Chapter 1, then the Lie algebra
of O(n; k) consists precisely of those real matrices X such that

\[
g X^{tr} g = -X,
\]

and the Lie algebra of SO(n; k) is the same as that of O(n; k). If Ω is the
matrix (1.8), then the Lie algebra of Sp(n; R) consists precisely of those real
matrices X such that

\[
\Omega X^{tr} \Omega = X,
\]

and the Lie algebra of Sp(n; C) consists precisely of those complex matrices X
satisfying the same condition. The Lie algebra of Sp(n) consists precisely of those
complex matrices X such that $\Omega X^{tr} \Omega = X$ and $X^* = -X$.


The verification of Proposition 3.25 is similar to our previous computations and
is omitted. The Lie algebra of SO(n; k) (which is the same as that of O(n; k)) is
denoted so(n; k), whereas the Lie algebras of the symplectic groups are denoted
sp(n; R), sp(n; C), and sp(n).


Proposition 3.26. The Lie algebra of the Heisenberg group H in Sect. 1.2.6 is the
space of all matrices of the form

\[
X = \begin{pmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{pmatrix}, \tag{3.10}
\]

with a, b, c ∈ R.


Proof. If X is strictly upper triangular, it is easy to verify that $X^m$ will be strictly
upper triangular for all positive integers m. Thus, for X as in (3.10), we will have
$e^{tX} = I + B$ with B strictly upper triangular, showing that $e^{tX} \in H$. Conversely, if
$e^{tX}$ belongs to H for all real t, then all of the entries of $e^{tX}$ on or below the diagonal
are independent of t. Thus, $X = d(e^{tX})/dt|_{t=0}$ will be of the form in (3.10). □


We leave it as an exercise to determine the Lie algebras of the Euclidean and 
Poincaré groups. 


Example 3.27. The following elements form a basis for the Lie algebra su(2):

\[
E_1 = \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}; \quad
E_2 = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}; \quad
E_3 = \frac{1}{2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]

These elements satisfy the commutation relations $[E_1, E_2] = E_3$, $[E_2, E_3] = E_1$,
and $[E_3, E_1] = E_2$. The following elements form a basis for the Lie algebra so(3):

\[
F_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}; \quad
F_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}; \quad
F_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]

These elements satisfy the commutation relations $[F_1, F_2] = F_3$, $[F_2, F_3] = F_1$, and
$[F_3, F_1] = F_2$.


Note that the listed relations completely determine all commutation relations
among, say, $E_1$, $E_2$, and $E_3$, since by the skew symmetry of the bracket, we must
have $[E_1, E_1] = 0$, $[E_2, E_1] = -E_3$, and so on. Since $E_1$, $E_2$, and $E_3$ satisfy the
same commutation relations as $F_1$, $F_2$, and $F_3$, the two Lie algebras are isomorphic.


Proof. Direct calculation from Proposition 3.24. □
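
The direct calculation can be delegated to a short script. The sketch below assumes the bases $E_1, E_2, E_3$ and $F_1, F_2, F_3$ written out as nested lists of Python numbers; the helper names (bracket, close, cyclic_relations, in_su2) are illustrative, and a small tolerance is used in place of exact equality.

```python
# Check of the commutation relations in Example 3.27, standard library only.
# Helper names are hypothetical; comparisons use a small tolerance.

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(AB, BA)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def cyclic_relations(A1, A2, A3):
    """True if [A1, A2] = A3, [A2, A3] = A1, and [A3, A1] = A2."""
    return (close(bracket(A1, A2), A3) and close(bracket(A2, A3), A1)
            and close(bracket(A3, A1), A2))

def in_su2(A, tol=1e-12):
    """True if the 2x2 matrix A satisfies A* = -A and trace(A) = 0."""
    skew = all(abs(complex(A[j][i]).conjugate() + A[i][j]) < tol
               for i in range(2) for j in range(2))
    return skew and abs(A[0][0] + A[1][1]) < tol

E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5j], [0.5j, 0]]
E3 = [[0, -0.5], [0.5, 0]]

F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

su2_ok = all(in_su2(E) for E in (E1, E2, E3)) and cyclic_relations(E1, E2, E3)
so3_ok = cyclic_relations(F1, F2, F3)
```

Both flags come out true, which is the content of the example: the two bases satisfy the same cyclic relations.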


3.5 Lie Group and Lie Algebra Homomorphisms 


The following theorem tells us that a Lie group homomorphism between two Lie 
groups gives rise in a natural way to a map between the corresponding Lie algebras. 
It will follow (Exercise 8) that isomorphic Lie groups have isomorphic Lie algebras. 


Theorem 3.28. Let G and H be matrix Lie groups, with Lie algebras g and h,
respectively. Suppose that Φ : G → H is a Lie group homomorphism. Then there
exists a unique real-linear map φ : g → h such that

\[
\Phi(e^{X}) = e^{\phi(X)} \tag{3.11}
\]

for all X ∈ g. The map φ has the following additional properties:


1. $\phi(AXA^{-1}) = \Phi(A)\phi(X)\Phi(A)^{-1}$, for all X ∈ g, A ∈ G.

2. $\phi([X, Y]) = [\phi(X), \phi(Y)]$, for all X, Y ∈ g.

3. $\phi(X) = \frac{d}{dt}\Phi(e^{tX})\big|_{t=0}$, for all X ∈ g.


In practice, given a Lie group homomorphism Φ, the way one goes about
computing φ is by using Property 3. In the language of manifolds, Property 3 says
that φ is the derivative (or differential) of Φ at the identity. By Point 2, φ : g → h is
a Lie algebra homomorphism. Thus, every Lie group homomorphism gives rise to a
Lie algebra homomorphism. In Chapter 5, we will investigate the reverse question:
If φ is a homomorphism between the Lie algebras of two Lie groups, is there an
associated Lie group homomorphism Φ?


Proof. The proof is similar to the proof of Theorem 3.20. Since Φ is a continuous
group homomorphism, $\Phi(e^{tX})$ will be a one-parameter subgroup of H, for each
X ∈ g. Thus, by Theorem 2.14, there is a unique matrix Z such that

\[
\Phi(e^{tX}) = e^{tZ} \tag{3.12}
\]

for all t ∈ R. We define φ(X) = Z and check that φ has the required properties.
First, by putting t = 1 in (3.12), we see that $\Phi(e^{X}) = e^{\phi(X)}$ for all X ∈ g. Next, if
$\Phi(e^{tX}) = e^{tZ}$ for all t, then $\Phi(e^{tsX}) = e^{tsZ}$, showing that φ(sX) = sφ(X). Using
the Lie product formula and the continuity of Φ, we then compute that

\[
e^{t\phi(X+Y)} = \Phi\left( \lim_{m \to \infty} \left( e^{tX/m} e^{tY/m} \right)^m \right)
= \lim_{m \to \infty} \left( \Phi(e^{tX/m}) \Phi(e^{tY/m}) \right)^m .
\]

Thus,

\[
e^{t\phi(X+Y)} = \lim_{m \to \infty} \left( e^{t\phi(X)/m} e^{t\phi(Y)/m} \right)^m = e^{t(\phi(X) + \phi(Y))}.
\]

Differentiating this result at t = 0 shows that φ(X + Y) = φ(X) + φ(Y).
We have thus obtained a real-linear map φ satisfying (3.11). If there were another
real-linear map φ′ with this property, we would have

\[
e^{t\phi(X)} = e^{t\phi'(X)} = \Phi(e^{tX})
\]

for all t ∈ R. Differentiating this result at t = 0 shows that φ(X) = φ′(X).
We now verify the remaining claimed properties of φ. For any A ∈ G, we have

\[
e^{t\phi(AXA^{-1})} = e^{\phi(tAXA^{-1})} = \Phi(e^{tAXA^{-1}}).
\]

Thus,

\[
e^{t\phi(AXA^{-1})} = \Phi(A)\Phi(e^{tX})\Phi(A)^{-1} = \Phi(A) e^{t\phi(X)} \Phi(A)^{-1}.
\]

Differentiating this identity at t = 0 gives Point 1.
Meanwhile, for any X and Y in g, we have, as in the proof of Theorem 3.20,

\[
\phi([X, Y]) = \phi\left( \frac{d}{dt}\, e^{tX} Y e^{-tX} \Big|_{t=0} \right)
= \frac{d}{dt}\, \phi\left( e^{tX} Y e^{-tX} \right) \Big|_{t=0},
\]

where we have used the fact that a derivative commutes with a linear transformation.
Thus,

\[
\phi([X, Y]) = \frac{d}{dt}\, \Phi(e^{tX}) \phi(Y) \Phi(e^{-tX}) \Big|_{t=0}
= \frac{d}{dt}\, e^{t\phi(X)} \phi(Y) e^{-t\phi(X)} \Big|_{t=0}
= [\phi(X), \phi(Y)],
\]

establishing Point 2. Finally, since $\Phi(e^{tX}) = e^{\phi(tX)} = e^{t\phi(X)}$, we can compute φ(X)
as in Point 3. □


Example 3.29. Let Φ : SU(2) → SO(3) be the homomorphism in Proposition 1.19.
Then the associated Lie algebra homomorphism φ : su(2) → so(3)
satisfies

\[
\phi(E_j) = F_j, \quad j = 1, 2, 3,
\]

where $\{E_1, E_2, E_3\}$ and $\{F_1, F_2, F_3\}$ are the bases for su(2) and so(3), respectively,
given in Example 3.27.


Since φ maps a basis for su(2) to a basis for so(3), we see that φ is a
Lie algebra isomorphism, even though Φ is not a Lie group isomorphism (since
ker(Φ) = {I, −I}).


Proof. If X is in su(2) and Y is in the space V in (1.14), then

\[
\frac{d}{dt}\, \Phi(e^{tX}) Y \Big|_{t=0} = \frac{d}{dt}\, e^{tX} Y e^{-tX} \Big|_{t=0} = [X, Y].
\]

Thus, φ(X) is the linear map of $V \cong \mathbb{R}^3$ to itself given by $Y \mapsto [X, Y]$. If, say,
$X = E_1$, then direct computation shows that

\[
\left[ E_1, \begin{pmatrix} x_1 & x_2 + i x_3 \\ x_2 - i x_3 & -x_1 \end{pmatrix} \right]
= \begin{pmatrix} x_1' & x_2' + i x_3' \\ x_2' - i x_3' & -x_1' \end{pmatrix},
\]

where $(x_1', x_2', x_3') = (0, -x_3, x_2)$. Since

\[
\begin{pmatrix} 0 \\ -x_3 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \tag{3.13}
\]

we conclude that $\phi(E_1)$ is the 3 × 3 matrix appearing on the right-hand side of (3.13),
which is precisely $F_1$. The computation of $\phi(E_2)$ and $\phi(E_3)$ is similar and is left to
the reader. □
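
The computation left to the reader can also be carried out mechanically. The sketch below assumes, as in the proof, that V consists of the matrices with rows $(x_1,\ x_2 + i x_3)$ and $(x_2 - i x_3,\ -x_1)$, and that φ(X) acts on V by $Y \mapsto [X, Y]$; the function names (to_matrix, to_coords, phi) are hypothetical.

```python
# Computes the 3x3 matrix of Y -> [E_j, Y] on V in the coordinates (x1, x2, x3)
# and compares it with F_j. Function names are illustrative.

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(AB, BA)]

def to_matrix(x):
    """Element of V with coordinates x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return [[complex(x1), x2 + 1j * x3], [x2 - 1j * x3, complex(-x1)]]

def to_coords(Yv):
    """Coordinates (x1, x2, x3) of an element of V."""
    return [Yv[0][0].real, Yv[0][1].real, Yv[0][1].imag]

def phi(E):
    """Matrix of the map Y -> [E, Y] on V; column j is the image of the jth basis vector."""
    cols = [to_coords(bracket(E, to_matrix(e)))
            for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5j], [0.5j, 0]]
E3 = [[0, -0.5], [0.5, 0]]

F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

matches = [close(phi(E), F) for E, F in ((E1, F1), (E2, F2), (E3, F3))]
```

All three comparisons succeed, in agreement with $\phi(E_j) = F_j$.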
Proposition 3.30. Suppose that G, H, and K are matrix Lie groups and
Φ : H → K and Ψ : G → H are Lie group homomorphisms. Let Λ : G → K be
the composition of Φ and Ψ and let φ, ψ, and λ be the Lie algebra maps associated
to Φ, Ψ, and Λ, respectively. Then we have

\[
\lambda = \phi \circ \psi .
\]

Proof. For any X ∈ g,

\[
\Lambda(e^{tX}) = \Phi(\Psi(e^{tX})) = \Phi(e^{t\psi(X)}) = e^{t\phi(\psi(X))}.
\]

Thus, λ(X) = φ(ψ(X)). □


Proposition 3.31. If Φ : G → H is a Lie group homomorphism and φ : g → h is
the associated Lie algebra homomorphism, then the kernel of Φ is a closed, normal
subgroup of G and the Lie algebra of the kernel is given by

\[
\mathrm{Lie}(\ker(\Phi)) = \ker(\phi).
\]

Proof. The usual algebraic argument shows that ker(Φ) is a normal subgroup of G.
Since, also, Φ is continuous, ker(Φ) is closed. If X ∈ ker(φ), then

\[
\Phi(e^{tX}) = e^{t\phi(X)} = I,
\]

for all t ∈ R, showing that X is in the Lie algebra of ker(Φ). In the other direction,
if $e^{tX}$ lies in ker(Φ) for all t ∈ R, then

\[
e^{t\phi(X)} = \Phi(e^{tX}) = I
\]

for all t. Differentiating this relation with respect to t at t = 0 gives that φ(X) = 0,
showing that X ∈ ker(φ). □


Definition 3.32 (The Adjoint Map). Let G be a matrix Lie group, with Lie algebra
g. Then for each A ∈ G, define a linear map $\mathrm{Ad}_A : g \to g$ by the formula

\[
\mathrm{Ad}_A(X) = AXA^{-1}.
\]


Proposition 3.33. Let G be a matrix Lie group, with Lie algebra g. Let GL(g)
denote the group of all invertible linear transformations of g. Then the map
$A \mapsto \mathrm{Ad}_A$ is a homomorphism of G into GL(g). Furthermore, for each A ∈ G, $\mathrm{Ad}_A$
satisfies $\mathrm{Ad}_A([X, Y]) = [\mathrm{Ad}_A(X), \mathrm{Ad}_A(Y)]$ for all X, Y ∈ g.


Proof. Easy. Note that Point 1 of Theorem 3.20 guarantees that $\mathrm{Ad}_A(X)$ is actually
in g for all X ∈ g. □


Since g is a real vector space with some dimension k, GL(g) is essentially the
same as GL(k; R). Thus, we will regard GL(g) as a matrix Lie group. It is easy to
show that Ad : G → GL(g) is continuous, and so is a Lie group homomorphism.
By Theorem 3.28, there is an associated real linear map $X \mapsto \mathrm{ad}_X$ from the Lie
algebra of G to the Lie algebra of GL(g) (i.e., from g to gl(g)), with the property
that

\[
e^{\mathrm{ad}_X} = \mathrm{Ad}_{e^X}.
\]

Here, gl(g) is the Lie algebra of GL(g), namely the space of all linear maps of g to
itself.


Proposition 3.34. Let G be a matrix Lie group, let g be its Lie algebra, and let
Ad : G → GL(g) be as in Proposition 3.33. Let ad : g → gl(g) be the associated
Lie algebra map. Then for all X, Y ∈ g

\[
\mathrm{ad}_X(Y) = [X, Y]. \tag{3.14}
\]


The proposition shows that our usage of the notation $\mathrm{ad}_X$ in this section is
consistent with that in Definition 3.7.


Proof. By Point 3 of Theorem 3.28, ad can be computed as follows:

\[
\mathrm{ad}_X = \frac{d}{dt}\, \mathrm{Ad}_{e^{tX}} \Big|_{t=0}.
\]

Thus,

\[
\mathrm{ad}_X(Y) = \frac{d}{dt}\, e^{tX} Y e^{-tX} \Big|_{t=0} = [X, Y],
\]

as claimed. □


We have proved, as a consequence of Theorem 3.28 and Proposition 3.34, the 
following result, which we will make use of later. 


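Proposition 3.35. For any X in $M_n(\mathbb{C})$, let $\mathrm{ad}_X : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ be given by
$\mathrm{ad}_X Y = [X, Y]$. Then for any Y in $M_n(\mathbb{C})$, we have

\[
e^{X} Y e^{-X} = \mathrm{Ad}_{e^X}(Y) = e^{\mathrm{ad}_X}(Y),
\]

where

\[
e^{\mathrm{ad}_X}(Y) = Y + [X, Y] + \frac{1}{2}[X, [X, Y]] + \cdots .
\]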


This result can also be proved by direct calculation—see Exercise 14. 
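
A numerical comparison of the two sides of Proposition 3.35 is sketched below for one sample pair of 2 × 2 real matrices. The sketch assumes a truncated power series for the matrix exponential and for the series in $\mathrm{ad}_X$; the helper names and the sample matrices are illustrative choices.

```python
# Compares e^X Y e^{-X} with the partial sums of Y + [X,Y] + [X,[X,Y]]/2! + ...
# Truncated series and helper names are illustrative assumptions.

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * a for a in row] for row in A]

def bracket(A, B):
    return add(mul(A, B), scale(-1.0, mul(B, A)))

def expm(A, terms=30):
    """Truncated power series for e^A; adequate for matrices of small norm."""
    n = len(A)
    result = term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, A))   # A^k / k!
        result = add(result, term)
    return result

def ad_series(X, Y, terms=30):
    """Partial sum of the series for e^{ad_X}(Y)."""
    total = term = Y
    for k in range(1, terms):
        term = scale(1.0 / k, bracket(X, term))   # (ad_X)^k (Y) / k!
        total = add(total, term)
    return total

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.3, -0.5], [0.2, 0.1]]
Y = [[1.0, 2.0], [-1.0, 0.5]]

lhs = mul(mul(expm(X), Y), expm(scale(-1.0, X)))   # e^X Y e^{-X}
rhs = ad_series(X, Y)                              # e^{ad_X}(Y)
gap = max_diff(lhs, rhs)
```

The two sides agree to within rounding error for this sample, as the proposition predicts.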
3.6 The Complexification of a Real Lie Algebra


In studying the representations of a matrix Lie group G (as we will do in later 
chapters), it is often useful to pass to the Lie algebra g of G, which is, in general, 
only a real Lie algebra. It is then often useful to pass to an associated complex Lie 
algebra, called the complexification of g. 


Definition 3.36. If V is a finite-dimensional real vector space, then the
complexification of V, denoted $V_{\mathbb{C}}$, is the space of formal linear combinations

\[
v_1 + i v_2,
\]

with $v_1, v_2 \in V$. This becomes a real vector space in the obvious way and becomes
a complex vector space if we define

\[
i(v_1 + i v_2) = -v_2 + i v_1.
\]


We could more pedantically define $V_{\mathbb{C}}$ to be the space of ordered pairs $(v_1, v_2)$
with $v_1, v_2 \in V$, but this is notationally cumbersome. It is straightforward to verify
that the above definition really makes $V_{\mathbb{C}}$ into a complex vector space. We will
regard V as a real subspace of $V_{\mathbb{C}}$ in the obvious way.


Proposition 3.37. Let g be a finite-dimensional real Lie algebra and $g_{\mathbb{C}}$ its
complexification. Then the bracket operation on g has a unique extension to $g_{\mathbb{C}}$
that makes $g_{\mathbb{C}}$ into a complex Lie algebra. The complex Lie algebra $g_{\mathbb{C}}$ is called the
complexification of the real Lie algebra g.


Proof. The uniqueness of the extension is obvious, since if the bracket operation on
$g_{\mathbb{C}}$ is to be bilinear, then it must be given by

\[
[X_1 + iX_2, Y_1 + iY_2] = ([X_1, Y_1] - [X_2, Y_2]) + i([X_1, Y_2] + [X_2, Y_1]). \tag{3.15}
\]

To show existence, we must now check that (3.15) is really bilinear and skew
symmetric and that it satisfies the Jacobi identity. It is clear that (3.15) is real
bilinear and skew symmetric. The skew symmetry means that if (3.15) is complex
linear in the first factor, it is also complex linear in the second factor. Thus, we need
only show that

\[
[i(X_1 + iX_2), Y_1 + iY_2] = i[X_1 + iX_2, Y_1 + iY_2]. \tag{3.16}
\]

The left-hand side of (3.16) is

\[
[-X_2 + iX_1, Y_1 + iY_2] = (-[X_2, Y_1] - [X_1, Y_2]) + i([X_1, Y_1] - [X_2, Y_2]),
\]

whereas the right-hand side of (3.16) is

\[
\begin{aligned}
& i\{([X_1, Y_1] - [X_2, Y_2]) + i([X_2, Y_1] + [X_1, Y_2])\} \\
&\quad = (-[X_2, Y_1] - [X_1, Y_2]) + i([X_1, Y_1] - [X_2, Y_2]),
\end{aligned}
\]

and, indeed, these expressions are equal.
It remains to check the Jacobi identity. Of course, the Jacobi identity holds if
X, Y, and Z are in g. Furthermore, for all X, Y, Z ∈ $g_{\mathbb{C}}$, the expression

\[
[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]]
\]

is complex-linear in X with Y and Z fixed. Thus, the Jacobi identity continues to
hold if X is in $g_{\mathbb{C}}$ and Y and Z are in g. The same argument then shows that the
Jacobi identity holds when X and Y are in $g_{\mathbb{C}}$ and Z is in g. Applying this argument
one more time establishes the Jacobi identity for $g_{\mathbb{C}}$ in general. □


Proposition 3.38. Suppose that g ⊂ $M_n(\mathbb{C})$ is a real Lie algebra and that for all
nonzero X in g, the element iX is not in g. Then the “abstract” complexification $g_{\mathbb{C}}$
of g in Definition 3.36 is isomorphic to the set of matrices in $M_n(\mathbb{C})$ that can be
expressed in the form X + iY with X and Y in g.


Proof. Consider the map from $g_{\mathbb{C}}$ into $M_n(\mathbb{C})$ sending the formal linear combination
X + iY to the linear combination X + iY of matrices. This map is easily seen
to be a complex Lie algebra homomorphism. If g satisfies the assumption in the
statement of the proposition, this map is also injective and thus an isomorphism of
$g_{\mathbb{C}}$ with g + ig ⊂ $M_n(\mathbb{C})$. □


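Using the proposition, we easily obtain the following list of isomorphisms:

\[
\begin{aligned}
\mathrm{gl}(n; \mathbb{R})_{\mathbb{C}} &\cong \mathrm{gl}(n; \mathbb{C}), \\
\mathrm{u}(n)_{\mathbb{C}} &\cong \mathrm{gl}(n; \mathbb{C}), \\
\mathrm{su}(n)_{\mathbb{C}} &\cong \mathrm{sl}(n; \mathbb{C}), \\
\mathrm{sl}(n; \mathbb{R})_{\mathbb{C}} &\cong \mathrm{sl}(n; \mathbb{C}), \\
\mathrm{so}(n)_{\mathbb{C}} &\cong \mathrm{so}(n; \mathbb{C}), \\
\mathrm{sp}(n; \mathbb{R})_{\mathbb{C}} &\cong \mathrm{sp}(n; \mathbb{C}), \\
\mathrm{sp}(n)_{\mathbb{C}} &\cong \mathrm{sp}(n; \mathbb{C}).
\end{aligned} \tag{3.17}
\]

Let us verify just one example, that of u(n). If $X^* = -X$, then $(iX)^* = iX$. Thus, X
and iX cannot both be in u(n) unless X is zero. Furthermore, every X in $M_n(\mathbb{C})$ can
be expressed as $X = X_1 + iX_2$, where $X_1 = (X - X^*)/2$ and $X_2 = (X + X^*)/(2i)$
are both in u(n). This shows that $\mathrm{u}(n)_{\mathbb{C}} \cong \mathrm{gl}(n; \mathbb{C})$.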
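
The decomposition $X = X_1 + iX_2$ can be tried out on a concrete matrix. The following sketch uses Python complex numbers and a small tolerance; the sample matrix and the helper names (dagger, is_skew_hermitian) are assumptions made for illustration.

```python
# Splits a sample complex matrix X as X1 + i*X2 with X1, X2 skew-Hermitian.
# The sample matrix and helper names are hypothetical.

def dagger(A):
    """Conjugate transpose A*."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def is_skew_hermitian(A, tol=1e-12):
    """True if A* = -A."""
    As = dagger(A)
    n = len(A)
    return all(abs(As[i][j] + A[i][j]) < tol for i in range(n) for j in range(n))

X = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
Xs = dagger(X)

X1 = [[(X[i][j] - Xs[i][j]) / 2 for j in range(2)] for i in range(2)]
X2 = [[(X[i][j] + Xs[i][j]) / 2j for j in range(2)] for i in range(2)]

recombined = [[X1[i][j] + 1j * X2[i][j] for j in range(2)] for i in range(2)]
recombine_gap = max(abs(recombined[i][j] - X[i][j])
                    for i in range(2) for j in range(2))
both_skew = is_skew_hermitian(X1) and is_skew_hermitian(X2)
```

Here both_skew is true and recombine_gap is zero up to rounding, matching the claim that every complex matrix is of the form $X_1 + iX_2$ with $X_1, X_2 \in \mathrm{u}(n)$.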

Although both $\mathrm{su}(2)_{\mathbb{C}}$ and $\mathrm{sl}(2; \mathbb{R})_{\mathbb{C}}$ are isomorphic to sl(2; C), the Lie algebra
su(2) is not isomorphic to sl(2; R). See Exercise 11.




Proposition 3.39. Let g be a real Lie algebra, $g_{\mathbb{C}}$ its complexification, and h an
arbitrary complex Lie algebra. Then every real Lie algebra homomorphism of g
into h extends uniquely to a complex Lie algebra homomorphism of $g_{\mathbb{C}}$ into h.


This result is the universal property of the complexification of a real Lie
algebra.


Proof. If π is a real Lie algebra homomorphism of g into h, the unique extension
is given by π(X + iY) = π(X) + iπ(Y) for all X, Y ∈ g. It is easy to check that
this map is, indeed, a homomorphism of complex Lie algebras. □


3.7 The Exponential Map 


Definition 3.40. If G is a matrix Lie group with Lie algebra g, then the exponential
map for G is the map

\[
\exp : g \to G.
\]


That is to say, the exponential map for G is the matrix exponential restricted
to the Lie algebra g of G. We have shown (Theorem 2.10) that every matrix in
GL(n; C) is the exponential of some n × n matrix. Nevertheless, if G ⊂ GL(n; C) is
a closed subgroup, there may exist A in G such that there is no X in the Lie algebra
g of G with exp X = A.


Example 3.41. There does not exist a matrix X ∈ sl(2; C) with

\[
e^{X} = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}, \tag{3.18}
\]

even though the matrix on the right-hand side of (3.18) is in SL(2; C).


Proof. If X ∈ sl(2; C) has distinct eigenvalues, then X is diagonalizable and $e^X$
will also be diagonalizable, unlike the matrix on the right-hand side of (3.18). If
X ∈ sl(2; C) has a repeated eigenvalue, this eigenvalue must be 0 or the trace of X
would not be zero. Thus, there is a nonzero vector v with Xv = 0, from which it
follows that $e^X v = e^0 v = v$. We conclude that $e^X$ has 1 as an eigenvalue, unlike
the matrix on the right-hand side of (3.18). □


We see, then, that the exponential map for a matrix Lie group G does not 
necessarily map g onto G. Furthermore, the exponential map may not be one-to- 
one on g, as may be seen, for example, from the case g = su(2). Nevertheless, it 
provides a crucial mechanism for passing information between the group and the 
Lie algebra. Indeed, we will see (Corollary 3.44) that the exponential map is locally 
one-to-one and onto, a result that will be essential later. 
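
The failure of injectivity for g = su(2) can be made concrete. The sketch below assumes the diagonal element $E_1 = \frac{1}{2}\,\mathrm{diag}(i, -i)$ of su(2) and uses the fact that the exponential of a diagonal matrix is the diagonal matrix of the exponentials of its entries; the function name exp_tE1 is hypothetical.

```python
# Illustrates that exp is not one-to-one on su(2): e^{4*pi*E1} = I = e^0.
# E1 = (1/2) diag(i, -i); the function name is an illustrative choice.
import cmath
import math

def exp_tE1(t):
    """e^{t E1}, computed entrywise since t*E1 is diagonal."""
    return [[cmath.exp(0.5j * t), 0], [0, cmath.exp(-0.5j * t)]]

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
minus_I2 = [[-1, 0], [0, -1]]

gap_full_turn = max_diff(exp_tE1(4 * math.pi), I2)        # e^{4 pi E1} vs I
gap_half_turn = max_diff(exp_tE1(2 * math.pi), minus_I2)  # e^{2 pi E1} vs -I
```

Both gaps are zero up to rounding, so the distinct elements 0 and $4\pi E_1$ of su(2) have the same exponential.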
Theorem 3.42. For 0 < ε < log 2, let $U_\varepsilon = \{X \in M_n(\mathbb{C}) \mid \|X\| < \varepsilon\}$ and let
$V_\varepsilon = \exp(U_\varepsilon)$. Suppose G ⊂ GL(n; C) is a matrix Lie group with Lie algebra g.
Then there exists ε ∈ (0, log 2) such that for all A ∈ $V_\varepsilon$, A is in G if and only if
log A is in g.


The condition ε < log 2 guarantees (Theorem 2.8) that for all X ∈ $U_\varepsilon$, $\log(e^X)$
is defined and equal to X. Note that if X = log A is in g, then $A = e^X$ is in G.
Thus, the content of the theorem is that for some ε, having A in $V_\varepsilon \cap G$ implies that
log A must be in g. See Figure 3.1.

Fig. 3.1 If A ∈ $V_\varepsilon$ belongs to G, then log A belongs to g

We begin with a lemma.


Lemma 3.43. Suppose $B_m$ are elements of G and that $B_m \to I$. Let $Y_m = \log B_m$,
which is defined for all sufficiently large m. Suppose that $Y_m$ is nonzero for all m
and that $Y_m / \|Y_m\| \to Y \in M_n(\mathbb{C})$. Then Y is in g.


Proof. For any t ∈ R, we have $(t/\|Y_m\|)\, Y_m \to tY$. Note that since $B_m \to I$, we
have $\|Y_m\| \to 0$. Thus, we can find integers $k_m$ such that $k_m \|Y_m\| \to t$. We have,
then,

\[
e^{k_m Y_m} = \exp\left[ (k_m \|Y_m\|) \frac{Y_m}{\|Y_m\|} \right] \to e^{tY}.
\]

However,

\[
e^{k_m Y_m} = (e^{Y_m})^{k_m} = (B_m)^{k_m} \in G
\]

and G is closed, and we conclude that $e^{tY} \in G$. This shows that Y ∈ g. (See
Figure 3.2.) □

Fig. 3.2 The points $k_m Y_m$ are converging to tY


Proof of Theorem 3.42. Let us think of $M_n(\mathbb{C})$ as $\mathbb{C}^{n^2} \cong \mathbb{R}^{2n^2}$ and let D denote
the orthogonal complement of g with respect to the usual inner product on $\mathbb{R}^{2n^2}$.
Consider the map Φ : $M_n(\mathbb{C}) \to M_n(\mathbb{C})$ given by

\[
\Phi(Z) = e^{X} e^{Y},
\]

where Z = X + Y with X ∈ g and Y ∈ D. Since (Proposition 2.16) the exponential
is continuously differentiable, the map Φ is also continuously differentiable, and we
may compute that

\[
\frac{d}{dt}\, \Phi(tX, 0) \Big|_{t=0} = X, \qquad
\frac{d}{dt}\, \Phi(0, tY) \Big|_{t=0} = Y.
\]

This calculation shows that the derivative of Φ at the point $0 \in \mathbb{R}^{2n^2}$ is the
identity. (Recall that the derivative at a point of a function from $\mathbb{R}^{2n^2}$ to itself is
a linear map of $\mathbb{R}^{2n^2}$ to itself.) Since the derivative of Φ at the origin is invertible,
the inverse function theorem says that Φ has a continuous local inverse, defined in
a neighborhood of I.

We need to prove that for some ε, if A ∈ $V_\varepsilon \cap G$, then log A ∈ g. If this were not
the case, we could find a sequence $A_m$ in G such that $A_m \to I$ as m → ∞ and such
that for all m, $\log A_m \notin g$. Using the local inverse of the map Φ, we can write $A_m$
(for all sufficiently large m) as

\[
A_m = e^{X_m} e^{Y_m}, \quad X_m \in g, \ Y_m \in D,
\]

with $X_m$ and $Y_m$ tending to zero as m tends to infinity. We must have $Y_m \neq 0$, since
otherwise we would have $\log A_m = X_m \in g$. Since $e^{X_m}$ and $A_m$ are in G, we see
that

\[
B_m := e^{-X_m} A_m = e^{Y_m}
\]

is in G.

Since the unit sphere in D is compact, we can choose a subsequence of the $Y_m$’s
(still called $Y_m$) so that $Y_m / \|Y_m\|$ converges to some Y ∈ D, with ‖Y‖ = 1. Then,
by the lemma, Y ∈ g. This is a contradiction, because D is the orthogonal
complement of g. Thus, there must be some ε such that log A ∈ g for all A in $V_\varepsilon \cap G$. □


3.8 Consequences of Theorem 3.42 


In this section, we derive several consequences of the main result of the last section, 
Theorem 3.42. 


Corollary 3.44. If G is a matrix Lie group with Lie algebra g, there exists a 
neighborhood U of 0 in g and a neighborhood V of I in G such that the exponential 
map takes U homeomorphically onto V. 


Proof. Let ε be such that Theorem 3.42 holds and set $U = U_\varepsilon \cap g$ and
$V = V_\varepsilon \cap G$. The theorem implies that exp takes U onto V. Furthermore, exp is a
homeomorphism of U onto V, since there is a continuous inverse map, namely,
the restriction of the matrix logarithm to V. □


Corollary 3.45. Let G be a matrix Lie group with Lie algebra g and let k be the
dimension of g as a real vector space. Then G is a smooth embedded submanifold
of $M_n(\mathbb{C})$ of dimension k and hence a Lie group.


It follows from the corollary that G is locally path connected: every point in
G has a neighborhood U that is homeomorphic to a ball in $\mathbb{R}^k$ and hence path
connected. It then follows that G is connected (in the usual topological sense) if and
only if it is path connected. (See, for example, Proposition 3.4.25 of [Run].)

Proof. Let ε ∈ (0, log 2) be such that Theorem 3.42 holds. Then for any $A_0 \in G$,
consider the neighborhood $A_0 V_\varepsilon$ of $A_0$ in $M_n(\mathbb{C})$. Note that A ∈ $A_0 V_\varepsilon$ if and only
if $A_0^{-1} A \in V_\varepsilon$. Define a local coordinate system on $A_0 V_\varepsilon$ by writing each A ∈ $A_0 V_\varepsilon$
as $A = A_0 e^{X}$, for X ∈ $U_\varepsilon \subset M_n(\mathbb{C})$. It follows from Theorem 3.42 that (for
A ∈ $A_0 V_\varepsilon$) A ∈ G if and only if X ∈ g. Thus, in this local coordinate system
defined near $A_0$, the group G looks like the subspace g of $M_n(\mathbb{C})$. Since we can find
such local coordinates near any point $A_0$ in G, we conclude that G is an embedded
submanifold of $M_n(\mathbb{C})$.

Now, the operation of matrix multiplication is clearly smooth. Furthermore, by
the formula for the inverse of a matrix in terms of cofactors, the map $A \mapsto A^{-1}$ is
also smooth on GL(n; C). The restrictions of these maps to G are then also smooth,
showing that G is a Lie group. □


Corollary 3.46. Suppose G ⊂ GL(n; C) is a matrix Lie group with Lie algebra g.
Then a matrix X is in g if and only if there exists a smooth curve γ in $M_n(\mathbb{C})$ with
γ(t) ∈ G for all t and such that γ(0) = I and $d\gamma/dt|_{t=0} = X$. Thus, g is the
tangent space at the identity to G.


This result is illustrated in Figure 3.1.


Proof. If X is in g, then we may take $\gamma(t) = e^{tX}$ and then γ(0) = I and
$d\gamma/dt|_{t=0} = X$. In the other direction, suppose that γ(t) is a smooth curve in G
with γ(0) = I. For all sufficiently small t, we can write $\gamma(t) = e^{\delta(t)}$, where δ is a
smooth curve in g. Now, the derivative of δ(t) at t = 0 is the same as the derivative
of $t \mapsto t\delta'(0)$ at t = 0. Thus, by the chain rule, we have

\[
\gamma'(0) = \frac{d}{dt}\, e^{\delta(t)} \Big|_{t=0} = \frac{d}{dt}\, e^{t\delta'(0)} \Big|_{t=0} = \delta'(0).
\]

Since δ(t) belongs to g for all sufficiently small t, we conclude (as in the proof of
Theorem 3.20) that δ′(0) = γ′(0) belongs to g. □


Corollary 3.47. If $G$ is a connected matrix Lie group, every element $A$ of $G$ can be
written in the form

\[
A = e^{X_1} e^{X_2} \cdots e^{X_m} \tag{3.19}
\]

for some $X_1, X_2, \ldots, X_m$ in $\mathfrak{g}$.


Even if $G$ is connected, it is in general not possible to write every $A \in G$ as a
single exponential, $A = \exp X$, with $X \in \mathfrak{g}$. (See Example 3.41.) We begin with a
simple analytic lemma.


Lemma 3.48. Suppose $A : [a,b] \to \mathrm{GL}(n;\mathbb{C})$ is a continuous map. Then for all
$\varepsilon > 0$ there exists $\delta > 0$ such that if $s, t \in [a,b]$ satisfy $|s - t| < \delta$, then

\[
\| A(s) A(t)^{-1} - I \| < \varepsilon .
\]

Proof. We note that

\[
\| A(s) A(t)^{-1} - I \| = \| (A(s) - A(t)) A(t)^{-1} \|
\le \| A(s) - A(t) \| \, \| A(t)^{-1} \| . \tag{3.20}
\]

Since $[a,b]$ is compact and the map $t \mapsto \| A(t)^{-1} \|$ is continuous, there is a constant
$C$ such that $\| A(t)^{-1} \| \le C$ for all $t \in [a,b]$. Furthermore, since $[a,b]$ is compact,
Theorem 4.19 in [Rud 1] tells us that the map $A$ is actually uniformly continuous on
$[a,b]$. Thus, for any $\varepsilon > 0$, there exists $\delta > 0$ such that when $|s - t| < \delta$, we have
$\| A(s) - A(t) \| < \varepsilon / C$. Thus, in light of (3.20), we have the desired $\delta$. □


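Proof of Corollary 3.47. Let $V_\varepsilon$ be as in Theorem 3.42. For any $A \in G$, choose a
continuous path $A : [0,1] \to G$ with $A(0) = I$ and $A(1) = A$. By Lemma 3.48,
applied to the continuous path $t \mapsto A(t)^{-1}$, we can find some $\delta > 0$ such that if
$|s - t| < \delta$, then $A(s)^{-1} A(t) \in V_\varepsilon$. Now
divide $[0,1]$ into $m$ pieces, where $1/m < \delta$. Then for $j = 1, \ldots, m$, we see that
$A((j-1)/m)^{-1} A(j/m)$ belongs to $V_\varepsilon$, so that

\[
A((j-1)/m)^{-1} A(j/m) = e^{X_j}
\]

for some elements $X_1, \ldots, X_m$ of $\mathfrak{g}$. Thus,

\[
\begin{aligned}
A &= A(0)^{-1} A(1) \\
  &= A(0)^{-1} A(1/m) \, A(1/m)^{-1} A(2/m) \cdots A((m-1)/m)^{-1} A(1) \\
  &= e^{X_1} e^{X_2} \cdots e^{X_m},
\end{aligned}
\]

as claimed. □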


Corollary 3.49. Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and
$\mathfrak{h}$, respectively, and assume $G$ is connected. Suppose $\Phi_1$ and $\Phi_2$ are Lie group
homomorphisms of $G$ into $H$ and let $\phi_1$ and $\phi_2$ be the associated Lie algebra
homomorphisms of $\mathfrak{g}$ into $\mathfrak{h}$. Then if $\phi_1 = \phi_2$, we have $\Phi_1 = \Phi_2$.


Proof. Since $G$ is connected, Corollary 3.47 tells us that
every element of $G$ can be written as $e^{X_1} e^{X_2} \cdots e^{X_m}$, with $X_j \in \mathfrak{g}$. Thus,

\[
\begin{aligned}
\Phi_1(e^{X_1} \cdots e^{X_m}) &= e^{\phi_1(X_1)} \cdots e^{\phi_1(X_m)} \\
  &= e^{\phi_2(X_1)} \cdots e^{\phi_2(X_m)} \\
  &= \Phi_2(e^{X_1} \cdots e^{X_m}),
\end{aligned}
\]

as claimed. □


Corollary 3.50. Every continuous homomorphism between two matrix Lie groups 
is smooth. 
Proof. For all $g \in G$, we write nearby elements $h \in G$ as $h = g e^{X}$, with $X \in \mathfrak{g}$, so
that

\[
\Phi(h) = \Phi(g) \Phi(e^{X}) = \Phi(g) e^{\phi(X)} .
\]

This relation says that in the exponential coordinates near $g$, the map $\Phi$ is a
composition of a linear map, the exponential map, and multiplication on the left
by $\Phi(g)$, all of which are smooth. This shows that $\Phi$ is smooth near $g$. □


Corollary 3.51. If G is a connected matrix Lie group and the Lie algebra g of G is 
commutative, then G is commutative. 


This result is a partial converse to Proposition 3.22. 


Proof. Since $\mathfrak{g}$ is commutative, any two elements of $G$, when written as in
Corollary 3.47, will commute. □


Corollary 3.52. If $G \subset M_n(\mathbb{C})$ is a matrix Lie group, the identity component $G_0$ of
$G$ is a closed subgroup of $\mathrm{GL}(n;\mathbb{C})$ and thus a matrix Lie group. Furthermore, the
Lie algebra of $G_0$ is the same as the Lie algebra of $G$.


Proof. Suppose that $(A_m)$ is a sequence in $G_0$ converging to some $A \in \mathrm{GL}(n;\mathbb{C})$.
Then certainly $A \in G$, since $G$ is closed. Furthermore, $A_m A^{-1}$ lies in $G$ for all
$m$ and $A_m A^{-1} \to I$ as $m \to \infty$. If $m$ is large enough, Theorem 3.42 tells us that
$A_m A^{-1} = e^{X}$ for some $X \in \mathfrak{g}$, so that $A = e^{-X} A_m$. Since $A_m \in G_0$, there is a path
joining $I$ to $A_m$ in $G$. But we can then join $A_m$ to $e^{-X} A_m = A$ by the path $e^{-tX} A_m$,
$0 \le t \le 1$. By combining these two paths, we can join $I$ to $A$ in $G$, showing that $A$
belongs to $G_0$.

Now, since $G_0 \subset G$, the Lie algebra of $G_0$ is contained in the Lie algebra of $G$.
In the other direction, if $e^{tX}$ lies in $G$ for all $t$, then it actually lies in $G_0$, since any
point $e^{t_0 X}$ on the curve $e^{tX}$ can be connected to $I$ in $G$, using the curve $e^{tX}$ itself, for
$0 \le t \le t_0$. □


3.9 Exercises 


1. (a) If $\mathfrak{g}$ is a Lie algebra, show that the center of $\mathfrak{g}$ is an ideal in $\mathfrak{g}$.

(b) If $\mathfrak{g}$ and $\mathfrak{h}$ are Lie algebras and $\phi : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra homomorphism,
show that the kernel of $\phi$ is an ideal in $\mathfrak{g}$.

2. Classify up to isomorphism all one-dimensional and two-dimensional real Lie
algebras. There is one isomorphism class of one-dimensional algebras and two
isomorphism classes of two-dimensional algebras.

3. Let $\mathfrak{g}$ denote the space of $n \times n$ upper triangular matrices with zeros on the
diagonal. Show that $\mathfrak{g}$ is a nilpotent Lie algebra under the bracket given by
$[X, Y] = XY - YX$.

4. Give an example of a matrix Lie group $G$ and a matrix $X$ such that $e^{X} \in G$,
but $X \notin \mathfrak{g}$.

5. If $G_1 \subset \mathrm{GL}(n_1;\mathbb{C})$ and $G_2 \subset \mathrm{GL}(n_2;\mathbb{C})$ are matrix Lie groups and $G_1 \times G_2$ is
their direct product (regarded as a subgroup of $\mathrm{GL}(n_1 + n_2;\mathbb{C})$ in the obvious
way), show that the Lie algebra of $G_1 \times G_2$ is isomorphic to $\mathfrak{g}_1 \oplus \mathfrak{g}_2$.

6. Let $G$ and $H$ be matrix Lie groups with $H \subset G$, so that the Lie algebra $\mathfrak{h}$ of $H$
is a subalgebra of the Lie algebra $\mathfrak{g}$ of $G$.

(a) Show that if $H$ is a normal subgroup of $G$, then $\mathfrak{h}$ is an ideal in $\mathfrak{g}$.
(b) Show that if $G$ and $H$ are connected and $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, then $H$ is a
normal subgroup of $G$.

7. Suppose $G \subset \mathrm{GL}(n;\mathbb{C})$ is a matrix Lie group with Lie algebra $\mathfrak{g}$. Suppose that
$A$ is in $G$ and that $\| A - I \| < 1$, so that the power series for $\log A$ is convergent.
Is it necessarily the case that $\log A$ is in $\mathfrak{g}$? Prove or give a counterexample.

8. Show that two isomorphic matrix Lie groups have isomorphic Lie algebras.

9. Write out explicitly the general form of a $4 \times 4$ real matrix in $\mathfrak{so}(3;1)$ (see
Proposition 3.25).

10. Show that there is an invertible linear map $\phi : \mathfrak{su}(2) \to \mathbb{R}^3$ such that
$\phi([X, Y]) = \phi(X) \times \phi(Y)$ for all $X, Y \in \mathfrak{su}(2)$, where $\times$ denotes the cross
product (vector product) on $\mathbb{R}^3$.

11. Show that $\mathfrak{su}(2)$ and $\mathfrak{sl}(2;\mathbb{R})$ are not isomorphic Lie algebras, even though
$\mathfrak{su}(2)_{\mathbb{C}} \cong \mathfrak{sl}(2;\mathbb{R})_{\mathbb{C}} \cong \mathfrak{sl}(2;\mathbb{C})$.

12. Show the groups $\mathrm{SU}(2)$ and $\mathrm{SO}(3)$ are not isomorphic, even though the
associated Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic.

13. Let $G$ be a matrix Lie group and let $\mathfrak{g}$ be its Lie algebra. For each $A \in G$, show
that $\mathrm{Ad}_A$ is a Lie algebra automorphism of $\mathfrak{g}$.

14. Let $X$ and $Y$ be $n \times n$ matrices. Show by induction that

\[
(\mathrm{ad}_X)^m (Y) = \sum_{k=0}^{m} \binom{m}{k} X^k Y (-X)^{m-k},
\]

where

\[
(\mathrm{ad}_X)^m (Y) = \underbrace{[X, \cdots [X, [X, Y]] \cdots ]}_{m} .
\]

Now, show by direct computation that

\[
e^{\mathrm{ad}_X}(Y) = \mathrm{Ad}_{e^X}(Y) = e^{X} Y e^{-X} .
\]

Assume it is legal to multiply power series term by term. (This result was
obtained indirectly in Proposition 3.35.)

Hint: Use Pascal’s triangle.

15. If $G$ is a matrix Lie group, a component of $G$ is the collection of all points that
can be connected to a fixed $A \in G$ by a continuous path in $G$. Show that if $G$
is compact, $G$ has only finitely many components.
Hint: Suppose $G$ had infinitely many components and consider a sequence
$A_j$ with each element of the sequence in a different component. Extract a
convergent subsequence $B_k = A_{j_k}$ and consider $B_k^{-1} B_{k+1}$.

16. Suppose that $G$ is a connected, commutative matrix Lie group with Lie
algebra $\mathfrak{g}$. Show that the exponential map for $G$ maps $\mathfrak{g}$ onto $G$.

17. Suppose $G$ is a connected matrix Lie group with Lie algebra $\mathfrak{g}$ and that $A$ is an
element of $G$. Show that $A$ belongs to the center of $G$ if and only if $\mathrm{Ad}_A(X) = X$
for all $X \in \mathfrak{g}$.

18. Show that the exponential map from the Lie algebra of the Heisenberg group to
the Heisenberg group is one-to-one and onto.

19. Show that the exponential map from $\mathfrak{u}(n)$ to $\mathrm{U}(n)$ is onto, but not one-to-one.
Hint: Every unitary matrix has an orthonormal basis of eigenvectors.

20. Suppose $X$ is a $2 \times 2$ real matrix with trace zero, and assume $X$ has a nonreal
eigenvalue.

(a) Show that the eigenvalues of $X$ must be of the form $ia, -ia$ with $a$ a nonzero
real number.

(b) Show that the corresponding eigenvectors of $X$ can be chosen to be
complex conjugates of each other, say, $v$ and $\bar{v}$.

(c) Show that there exists an invertible real matrix $C$ such that

\[
X = C \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} C^{-1} .
\]

Hint: Use $v$ and $\bar{v}$ to construct a real basis for $\mathbb{R}^2$, and determine the matrix $X$
in this basis.

21. Suppose $A$ is a $2 \times 2$ real matrix with determinant one, and assume $A$ has a
nonreal eigenvalue. Show that there exists a real number $\theta$ that is not an integer
multiple of $\pi$ and an invertible real matrix $C$ such that

\[
A = C \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} C^{-1} .
\]


22. Show that the image of the exponential map for $\mathrm{SL}(2;\mathbb{R})$ consists of precisely
those matrices $A \in \mathrm{SL}(2;\mathbb{R})$ such that $\mathrm{trace}(A) > -2$, together with the
matrix $-I$ (which has trace $-2$). To do this, consider the possibilities for the
eigenvalues of a matrix in the Lie algebra $\mathfrak{sl}(2;\mathbb{R})$ and in the group $\mathrm{SL}(2;\mathbb{R})$. In
the Lie algebra, show that the eigenvalues are of the form $(\lambda, -\lambda)$ or $(i\lambda, -i\lambda)$,
with $\lambda$ real. In the group, show that the eigenvalues are of the form $(a, 1/a)$ or
$(-a, -1/a)$, with $a$ real and positive, or of the form $(e^{i\theta}, e^{-i\theta})$, with $\theta$ real. The
case of a repeated eigenvalue ($(0, 0)$ in the Lie algebra and $(1, 1)$ or $(-1, -1)$
in the group) will have to be treated separately using the Jordan canonical form
(Sect. A.4).
Hint: You may assume that if a real matrix $X$ has real eigenvalues, then $X$ is
similar over the reals to its Jordan canonical form. Then use the two previous
exercises.


Chapter 4 
Basic Representation Theory 


4.1 Representations 


If $V$ is a finite-dimensional real or complex vector space, let $\mathrm{GL}(V)$ denote the
group of invertible linear transformations of $V$. If we choose a basis for $V$, we can
identify $\mathrm{GL}(V)$ with $\mathrm{GL}(n;\mathbb{R})$ or $\mathrm{GL}(n;\mathbb{C})$. Any such identification gives rise to a
topology on $\mathrm{GL}(V)$, which is easily seen to be independent of the choice of basis.
With this discussion in mind, we think of $\mathrm{GL}(V)$ as a matrix Lie group. Similarly,
we let $\mathfrak{gl}(V) = \mathrm{End}(V)$ denote the space of all linear operators from $V$ to itself,
which forms a Lie algebra under the bracket $[X, Y] = XY - YX$.


Definition 4.1. Let $G$ be a matrix Lie group. A finite-dimensional complex
representation of $G$ is a Lie group homomorphism

\[
\Pi : G \to \mathrm{GL}(V),
\]

where $V$ is a finite-dimensional complex vector space (with $\dim(V) \ge 1$). A
finite-dimensional real representation of $G$ is a Lie group homomorphism $\Pi$ of $G$ into
$\mathrm{GL}(V)$, where $V$ is a finite-dimensional real vector space.

If $\mathfrak{g}$ is a real or complex Lie algebra, then a finite-dimensional complex
representation of $\mathfrak{g}$ is a Lie algebra homomorphism $\pi$ of $\mathfrak{g}$ into $\mathfrak{gl}(V)$, where
$V$ is a finite-dimensional complex vector space. If $\mathfrak{g}$ is a real Lie algebra, then a
finite-dimensional real representation of $\mathfrak{g}$ is a Lie algebra homomorphism $\pi$ of
$\mathfrak{g}$ into $\mathfrak{gl}(V)$, where $V$ is a finite-dimensional real vector space.

If $\Pi$ or $\pi$ is a one-to-one homomorphism, the representation is called faithful.


One should think of a representation as a linear action of a group or Lie algebra
on a vector space (since, say, to every $g \in G$, there is associated an operator $\Pi(g)$,
which acts on the vector space $V$). If the homomorphism $\Pi : G \to \mathrm{GL}(V)$ is fixed,
we will occasionally use the alternative notation

\[
g \cdot v \tag{4.1}
\]

in place of $\Pi(g)v$. We will often use terminology such as “Let $\Pi$ be a representation
of $G$ acting on the space $V$.”

If a representation $\Pi$ is a faithful representation of a matrix Lie group $G$, then
$\{\Pi(A) \mid A \in G\}$ is a group of matrices that is isomorphic to the original group $G$.
Thus, $\Pi$ allows us to represent $G$ as a group of matrices. This is the motivation
for the term “representation.” (Of course, we still call $\Pi$ a representation even if
it is not faithful.) Despite the origin of the term, the goal of representation theory
is not simply to represent a group as a group of matrices. After all, the groups we
study in this book are already matrix groups! Rather, the goal is to determine (up to
isomorphism) all the ways a fixed group can act as a group of matrices.

Linear actions of groups on vector spaces arise naturally in many branches of
both mathematics and physics. A typical example would be a linear differential
equation in three-dimensional space which has rotational symmetry, such as the
equations that describe the energy states of a hydrogen atom in quantum mechanics.
Since the equation is rotationally invariant, the space of solutions is invariant
under rotations and thus constitutes a representation of the rotation group $\mathrm{SO}(3)$. The
representation theory of $\mathrm{SO}(3)$ (or of its Lie algebra) is very helpful in narrowing
down what the space of solutions can be. See, for example, Chapter 18 in [Hall].


Definition 4.2. Let $\Pi$ be a finite-dimensional real or complex representation of a
matrix Lie group $G$, acting on a space $V$. A subspace $W$ of $V$ is called invariant
if $\Pi(A)w \in W$ for all $w \in W$ and all $A \in G$. An invariant subspace $W$ is called
nontrivial if $W \ne \{0\}$ and $W \ne V$. A representation with no nontrivial invariant
subspaces is called irreducible.

The terms invariant, nontrivial, and irreducible are defined analogously for 
representations of Lie algebras. 


Even if g is a real Lie algebra, we will consider mainly complex representations of 
g. It should be emphasized that if we are speaking about complex representations 
of a real Lie algebra g acting on a complex subspace V, an invariant subspace W is 
required to be a complex subspace of V. 


Definition 4.3. Let $G$ be a matrix Lie group, let $\Pi$ be a representation of $G$ acting
on the space $V$, and let $\Sigma$ be a representation of $G$ acting on the space $W$. A linear
map $\phi : V \to W$ is called an intertwining map of representations if

\[
\phi(\Pi(A)v) = \Sigma(A)\phi(v)
\]

for all $A \in G$ and all $v \in V$. The analogous property defines intertwining maps of
representations of a Lie algebra.

If $\phi$ is an intertwining map of representations and, in addition, $\phi$ is invertible,
then $\phi$ is said to be an isomorphism of representations. If there exists an
isomorphism between $V$ and $W$, then the representations are said to be isomorphic.

If we use the “action” notation of (4.1), the defining property of an intertwining
map may be written as

\[
\phi(A \cdot v) = A \cdot \phi(v)
\]

for all $A \in G$ and $v \in V$. That is to say, $\phi$ should commute with the action of $G$.
A typical problem in representation theory is to determine, up to isomorphism, all
of the irreducible representations of a particular group or Lie algebra. In Sect. 4.6,
we will determine all the finite-dimensional complex irreducible representations of
the Lie algebra $\mathfrak{sl}(2;\mathbb{C})$.

After identifying GL(V) with GL(n; R) or GL(n; C), Theorem 3.28 has the 
following consequence. 


Proposition 4.4. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$ and let $\Pi$ be a
(finite-dimensional real or complex) representation of $G$, acting on the space $V$.
Then there is a unique representation $\pi$ of $\mathfrak{g}$ acting on the same space such that

\[
\Pi(e^{X}) = e^{\pi(X)}
\]

for all $X \in \mathfrak{g}$. The representation $\pi$ can be computed as

\[
\pi(X) = \left.\frac{d}{dt} \Pi(e^{tX})\right|_{t=0}
\]

and satisfies

\[
\pi(AXA^{-1}) = \Pi(A)\pi(X)\Pi(A)^{-1}
\]

for all $X \in \mathfrak{g}$ and all $A \in G$.


Given a matrix Lie group $G$ with Lie algebra $\mathfrak{g}$, we may ask whether every
representation $\pi$ of $\mathfrak{g}$ comes from a representation $\Pi$ of $G$. As it turns out, this is
not true in general, but is true if $G$ is simply connected. See Sect. 4.7 for examples
of this phenomenon and Sect. 5.7 for the general result.


Proposition 4.5. 


1. Let $G$ be a connected matrix Lie group with Lie algebra $\mathfrak{g}$. Let $\Pi$ be a
representation of $G$ and $\pi$ the associated representation of $\mathfrak{g}$. Then $\Pi$ is
irreducible if and only if $\pi$ is irreducible.

2. Let $G$ be a connected matrix Lie group, let $\Pi_1$ and $\Pi_2$ be representations of $G$,
and let $\pi_1$ and $\pi_2$ be the associated Lie algebra representations. Then $\pi_1$ and $\pi_2$
are isomorphic if and only if $\Pi_1$ and $\Pi_2$ are isomorphic.


Proof. For Point 1, suppose first that $\Pi$ is irreducible. We then want to show that
$\pi$ is irreducible. So, let $W$ be a subspace of $V$ that is invariant under $\pi(X)$ for all
$X \in \mathfrak{g}$. We want to show that $W$ is either $\{0\}$ or $V$. Now, suppose $A$ is an element
of $G$. Since $G$ is assumed connected, Corollary 3.47 tells us that $A$ can be written
as $A = e^{X_1} \cdots e^{X_m}$ for some $X_1, \ldots, X_m$ in $\mathfrak{g}$. Since $W$ is invariant under $\pi(X_j)$,
it will also be invariant under $\exp(\pi(X_j)) = I + \pi(X_j) + \pi(X_j)^2/2 + \cdots$ and,
hence, under

\[
\Pi(A) = \Pi(e^{X_1} \cdots e^{X_m}) = \Pi(e^{X_1}) \cdots \Pi(e^{X_m})
       = e^{\pi(X_1)} \cdots e^{\pi(X_m)} .
\]

Since $\Pi$ is irreducible and $W$ is invariant under each $\Pi(A)$, $W$ must be either $\{0\}$
or $V$. This shows that $\pi$ is irreducible.

In the other direction, assume that $\pi$ is irreducible and that $W$ is an invariant
subspace for $\Pi$. Then $W$ is invariant under $\Pi(\exp tX)$ for all $X \in \mathfrak{g}$ and, hence,
under

\[
\pi(X) = \left.\frac{d}{dt} \Pi(e^{tX})\right|_{t=0} .
\]

Thus, since $\pi$ is irreducible, $W$ is $\{0\}$ or $V$, and we conclude that $\Pi$ is irreducible.
This establishes Point 1 of the proposition.

Point 2 of the proposition is similar and is left as an exercise to the reader
(Exercise 1). □


Proposition 4.6. Let $\mathfrak{g}$ be a real Lie algebra and $\mathfrak{g}_{\mathbb{C}}$ its complexification. Then
every finite-dimensional complex representation $\pi$ of $\mathfrak{g}$ has a unique extension to a
complex-linear representation of $\mathfrak{g}_{\mathbb{C}}$, also denoted $\pi$. Furthermore, $\pi$ is irreducible
as a representation of $\mathfrak{g}_{\mathbb{C}}$ if and only if it is irreducible as a representation of $\mathfrak{g}$.


Of course, the extension of $\pi$ to $\mathfrak{g}_{\mathbb{C}}$ is given by $\pi(X + iY) = \pi(X) + i\pi(Y)$ for
all $X, Y \in \mathfrak{g}$.


Proof. The existence and uniqueness of the extension follow from Proposition 3.39.
The claim about irreducibility holds because a complex subspace $W$ of $V$ is
invariant under $\pi(X + iY)$, with $X$ and $Y$ in $\mathfrak{g}$, if and only if it is invariant under
the operators $\pi(X)$ and $\pi(Y)$. Thus, the representation of $\mathfrak{g}$ and its extension to $\mathfrak{g}_{\mathbb{C}}$
have precisely the same invariant subspaces. □


Definition 4.7. If $V$ is a finite-dimensional inner product space and $G$ is a matrix
Lie group, a representation $\Pi : G \to \mathrm{GL}(V)$ is unitary if $\Pi(A)$ is a unitary
operator on $V$ for every $A \in G$.


Proposition 4.8. Suppose $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$. Suppose $V$
is a finite-dimensional inner product space, $\Pi$ is a representation of $G$ acting on $V$,
and $\pi$ is the associated representation of $\mathfrak{g}$. If $\Pi$ is unitary, then $\pi(X)$ is skew
self-adjoint for all $X \in \mathfrak{g}$. Conversely, if $G$ is connected and $\pi(X)$ is skew self-adjoint
for all $X \in \mathfrak{g}$, then $\Pi$ is unitary.

In a slight abuse of notation, we will say that a representation $\pi$ of a real Lie
algebra $\mathfrak{g}$ acting on a finite-dimensional inner product space is unitary if $\pi(X)$ is
skew self-adjoint for all $X \in \mathfrak{g}$.


Proof. The proof is similar to the computation of the Lie algebra of the unitary
group $\mathrm{U}(n)$. If $\Pi$ is unitary, then for all $X \in \mathfrak{g}$ we have

\[
(e^{t\pi(X)})^{*} = \Pi(e^{tX})^{*} = \Pi(e^{tX})^{-1} = e^{-t\pi(X)}, \quad t \in \mathbb{R},
\]

so that $e^{t\pi(X)^{*}} = e^{-t\pi(X)}$. Differentiating this relation with respect to $t$ at $t = 0$
gives $\pi(X)^{*} = -\pi(X)$. In the other direction, if $\pi(X)^{*} = -\pi(X)$, then the
above calculation shows that $\Pi(e^{tX}) = e^{t\pi(X)}$ is unitary. If $G$ is connected, then
by Corollary 3.47, every $A \in G$ is a product of exponentials, showing that $\Pi(A)$ is
unitary. □
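
The converse direction of Proposition 4.8 can be spot-checked numerically. The following is a minimal sketch, not part of the original text: it assumes a truncated power series is an adequate stand-in for the matrix exponential, and the helper names (`mat_mul`, `mat_exp`, `dagger`) and the sample matrix `X` are hypothetical choices made for illustration.

```python
# Sketch: exponentiating a skew self-adjoint matrix should give a unitary matrix.
# All names below are illustrative; only the Python standard library is used.

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(x, terms=40):
    """Truncated power series I + X + X^2/2! + ... for the matrix exponential."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds X^k / k!
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def dagger(a):
    """Conjugate transpose (adjoint) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

# A sample skew self-adjoint matrix: X* = -X.
X = [[1j, 2 + 1j], [-2 + 1j, -3j]]

U = mat_exp(X)
UstarU = mat_mul(dagger(U), U)
# Largest deviation of U* U from the identity matrix.
defect = max(abs(UstarU[i][j] - (1 if i == j else 0))
             for i in range(2) for j in range(2))
print(defect)
```

The printed defect should be at the level of floating-point round-off, which is consistent with $e^{X}$ being unitary when $X^{*} = -X$.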


4.2 Examples of Representations 


A matrix Lie group $G$ is, by definition, a subset of some $\mathrm{GL}(n;\mathbb{C})$. The inclusion
map of $G$ into $\mathrm{GL}(n;\mathbb{C})$ (i.e., the map $\Pi(A) = A$) is a representation of $G$, called
the standard representation of $G$. If $G$ happens to be contained in $\mathrm{GL}(n;\mathbb{R}) \subset
\mathrm{GL}(n;\mathbb{C})$, then we can also think of the standard representation as a real
representation. Thus, for example, the standard representation of $\mathrm{SO}(3)$ is the one in
which $\mathrm{SO}(3)$ acts in the usual way on $\mathbb{R}^3$ and the standard representation of $\mathrm{SU}(2)$
is the one in which $\mathrm{SU}(2)$ acts on $\mathbb{C}^2$ in the usual way. Similarly, if $\mathfrak{g} \subset M_n(\mathbb{C})$ is a
Lie algebra of matrices, the map $\pi(X) = X$ is called the standard representation
of $\mathfrak{g}$.

Consider the one-dimensional complex vector space $\mathbb{C}$. For any matrix Lie group
$G$, we can define the trivial representation, $\Pi : G \to \mathrm{GL}(1;\mathbb{C})$, by the formula

\[
\Pi(A) = I
\]

for all $A \in G$. Of course, this is an irreducible representation, since $\mathbb{C}$ has no
nontrivial subspaces, let alone nontrivial invariant subspaces. If $\mathfrak{g}$ is a Lie algebra,
we can also define the trivial representation of $\mathfrak{g}$, $\pi : \mathfrak{g} \to \mathfrak{gl}(1;\mathbb{C})$, by

\[
\pi(X) = 0
\]

for all $X \in \mathfrak{g}$. This is an irreducible representation.

Recall the adjoint map of a group or Lie algebra, described in Definitions 3.32
and 3.7.


Definition 4.9. If $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$, the adjoint
representation of $G$ is the map $\mathrm{Ad} : G \to \mathrm{GL}(\mathfrak{g})$ given by $A \mapsto \mathrm{Ad}_A$. Similarly, the adjoint
representation of a finite-dimensional Lie algebra $\mathfrak{g}$ is the map $\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$
given by $X \mapsto \mathrm{ad}_X$.


If G is a matrix Lie group with Lie algebra g, then by Proposition 3.34, the Lie 
algebra representation associated to the adjoint representation of G is the adjoint 
representation of g. Note that in the case of SO(3), the standard representation and 
the adjoint representation are both three-dimensional real representations. In fact, 
these two representations are isomorphic (Exercise 2). 


Example 4.10. Let $V_m$ denote the space of homogeneous polynomials of degree
$m$ in two complex variables. For each $U \in \mathrm{SU}(2)$, define a linear transformation
$\Pi_m(U)$ on the space $V_m$ by the formula

\[
[\Pi_m(U) f](z) = f(U^{-1} z), \quad z \in \mathbb{C}^2 . \tag{4.2}
\]

Then $\Pi_m$ is a representation of $\mathrm{SU}(2)$.


Elements of $V_m$ have the form

\[
f(z_1, z_2) = a_0 z_1^{m} + a_1 z_1^{m-1} z_2 + a_2 z_1^{m-2} z_2^{2} + \cdots + a_m z_2^{m} \tag{4.3}
\]

with $z_1, z_2 \in \mathbb{C}$ and the $a_j$’s arbitrary complex constants, from which we see that
$\dim(V_m) = m + 1$. Explicitly, if $f$ is as in (4.3), then

\[
[\Pi_m(U) f](z_1, z_2) = \sum_{k=0}^{m} a_k
  (U^{-1}_{11} z_1 + U^{-1}_{12} z_2)^{m-k} (U^{-1}_{21} z_1 + U^{-1}_{22} z_2)^{k},
\]

where $U^{-1}_{jk}$ denotes the $(j,k)$ entry of $U^{-1}$.

By expanding out the right-hand side of this formula, we see that $\Pi_m(U) f$ is again
a homogeneous polynomial of degree $m$. Thus, $\Pi_m(U)$ actually maps $V_m$ into $V_m$.
To see that $\Pi_m$ is actually a representation, compute that

\[
\Pi_m(U_1)[\Pi_m(U_2) f](z) = [\Pi_m(U_2) f](U_1^{-1} z) = f(U_2^{-1} U_1^{-1} z)
  = \Pi_m(U_1 U_2) f(z) .
\]

The inverse on the right-hand side of (4.2) is necessary in order to make $\Pi_m$ a
representation. We will see in Proposition 4.11 that each $\Pi_m$ is irreducible and
we will see in Sect. 4.6 that every finite-dimensional irreducible representation of
$\mathrm{SU}(2)$ is isomorphic to one (and only one) of the $\Pi_m$’s. (Of course, no two of the
$\Pi_m$’s are isomorphic, since they do not even have the same dimension.)
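
The role of the inverse in (4.2) can be checked on a concrete case. The sketch below is an added illustration under stated assumptions: polynomials are modeled as plain Python functions of a pair `z`, and the names `su2`, `act`, `act_without_inverse`, the sample polynomial `f`, and the test point `z` are all hypothetical choices, not notation from the text.

```python
# Sketch: [Pi_m(U) f](z) = f(U^{-1} z) respects products, while f(U z) does not.
# Names and sample data are illustrative; only the standard library is used.

def su2(alpha, beta):
    """The matrix [[a, b], [-conj(b), conj(a)]] with (a, b) rescaled so |a|^2 + |b|^2 = 1."""
    alpha, beta = complex(alpha), complex(beta)
    norm = (abs(alpha) ** 2 + abs(beta) ** 2) ** 0.5
    a, b = alpha / norm, beta / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mul(u, v):
    """Product of two 2 x 2 matrices."""
    return [[sum(u[i][k] * v[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(u):
    """For U in SU(2), the inverse equals the conjugate transpose."""
    return [[u[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(u, z):
    """Matrix u applied to the vector z = (z1, z2)."""
    return (u[0][0] * z[0] + u[0][1] * z[1], u[1][0] * z[0] + u[1][1] * z[1])

def act(u, f):
    """The action of formula (4.2): z -> f(U^{-1} z)."""
    u_inv = inverse(u)
    return lambda z: f(apply(u_inv, z))

def act_without_inverse(u, f):
    """The variant z -> f(U z), included only for comparison."""
    return lambda z: f(apply(u, z))

def f(z):
    """A sample homogeneous polynomial of degree 3."""
    z1, z2 = z
    return 2 * z1 ** 3 - 1j * z1 ** 2 * z2 + 5 * z2 ** 3

U1 = su2(1j, 0)   # diag(i, -i)
U2 = su2(0, 1)    # [[0, 1], [-1, 0]]
z = (1, 2)

good_gap = abs(act(U1, act(U2, f))(z) - act(mul(U1, U2), f)(z))
bad_gap = abs(act_without_inverse(U1, act_without_inverse(U2, f))(z)
              - act_without_inverse(mul(U1, U2), f)(z))
print(good_gap, bad_gap)
```

For this pair of matrices, `good_gap` vanishes up to round-off while `bad_gap` does not: without the inverse one obtains $f(U_2 U_1 z)$ instead of $f(U_1 U_2 z)$, and here $U_2 U_1 = -U_1 U_2$.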

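The associated representation $\pi_m$ of $\mathfrak{su}(2)$ can be computed as

\[
(\pi_m(X) f)(z) = \left.\frac{d}{dt} f(e^{-tX} z)\right|_{t=0} .
\]

Now, let $z(t) = (z_1(t), z_2(t))$ be the curve in $\mathbb{C}^2$ defined as $z(t) = e^{-tX} z$. By the
chain rule, we have

\[
\pi_m(X) f = \frac{\partial f}{\partial z_1} \left.\frac{dz_1}{dt}\right|_{t=0}
           + \frac{\partial f}{\partial z_2} \left.\frac{dz_2}{dt}\right|_{t=0} .
\]

Since $dz/dt|_{t=0} = -Xz$, we obtain

\[
\pi_m(X) f = -\frac{\partial f}{\partial z_1} (X_{11} z_1 + X_{12} z_2)
             - \frac{\partial f}{\partial z_2} (X_{21} z_1 + X_{22} z_2) . \tag{4.4}
\]

We may then take the unique complex-linear extension of $\pi_m$ to $\mathfrak{sl}(2;\mathbb{C}) \cong \mathfrak{su}(2)_{\mathbb{C}}$,
as in Proposition 3.39. This extension is given by the same formula, but with $X \in
\mathfrak{sl}(2;\mathbb{C})$.

If $X$, $Y$, and $H$ are the following basis elements for $\mathfrak{sl}(2;\mathbb{C})$:

\[
H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},
\]

then applying formula (4.4) gives

\[
\begin{aligned}
\pi_m(H) &= -z_1 \frac{\partial}{\partial z_1} + z_2 \frac{\partial}{\partial z_2}, \\
\pi_m(X) &= -z_2 \frac{\partial}{\partial z_1}, \\
\pi_m(Y) &= -z_1 \frac{\partial}{\partial z_2} .
\end{aligned}
\]

Applying these operators to a basis element $z_1^{m-k} z_2^{k}$ for $V_m$ gives

\[
\begin{aligned}
\pi_m(H)(z_1^{m-k} z_2^{k}) &= (-m + 2k)\, z_1^{m-k} z_2^{k}, \\
\pi_m(X)(z_1^{m-k} z_2^{k}) &= -(m - k)\, z_1^{m-k-1} z_2^{k+1}, \\
\pi_m(Y)(z_1^{m-k} z_2^{k}) &= -k\, z_1^{m-k+1} z_2^{k-1} .
\end{aligned} \tag{4.5}
\]

Thus, $z_1^{m-k} z_2^{k}$ is an eigenvector for $\pi_m(H)$ with eigenvalue $-m + 2k$, while $\pi_m(X)$
and $\pi_m(Y)$ have the effect of shifting the exponent $k$ of $z_2$ up or down by one. Note
that since $\pi_m(X)$ increases the value of $k$, this operator increases the eigenvalue of
$\pi_m(H)$ by 2, whereas $\pi_m(Y)$ decreases the eigenvalue of $\pi_m(H)$ by 2.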
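
The formulas in (4.5) determine the matrices of $\pi_m(H)$, $\pi_m(X)$, and $\pi_m(Y)$ in the monomial basis, so the bracket relations of $\mathfrak{sl}(2;\mathbb{C})$ can be verified with exact integer arithmetic. The following is a minimal sketch added for illustration; the function names and the convention that column $k$ holds the image of the $k$-th basis vector $z_1^{m-k} z_2^{k}$ are assumptions of this sketch, not notation from the text.

```python
# Sketch: matrices of pi_m(H), pi_m(X), pi_m(Y) read off from (4.5),
# followed by a check of [H, X] = 2X, [H, Y] = -2Y, [X, Y] = H.
# Names are illustrative; only the standard library is used.

def pi_m_matrices(m):
    """Return (H, X, Y) as (m+1) x (m+1) integer matrices.

    Basis vector k (for k = 0..m) stands for z1^(m-k) z2^k, and
    column k of each matrix holds the image of that basis vector.
    """
    n = m + 1
    H = [[0] * n for _ in range(n)]
    X = [[0] * n for _ in range(n)]
    Y = [[0] * n for _ in range(n)]
    for k in range(n):
        H[k][k] = -m + 2 * k           # eigenvalue -m + 2k
        if k + 1 <= m:
            X[k + 1][k] = -(m - k)     # exponent of z2 goes from k to k + 1
        if k - 1 >= 0:
            Y[k - 1][k] = -k           # exponent of z2 goes from k to k - 1
    return H, X, Y

def mul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Commutator ab - ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * entry for entry in row] for row in a]

H, X, Y = pi_m_matrices(3)
print(bracket(H, X) == scale(2, X),
      bracket(H, Y) == scale(-2, Y),
      bracket(X, Y) == H)
```

Since $\pi_m$ is a Lie algebra homomorphism, these matrices must satisfy the same bracket relations as $H$, $X$, and $Y$ themselves, namely $[H, X] = 2X$, $[H, Y] = -2Y$, and $[X, Y] = H$.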


Fig. 4.1 The black dots indicate the nonzero terms for a vector $w$ in the space $V_6$.
Applying $\pi_6(X)^4$ to $w$ gives a nonzero multiple of $z_2^6$.

Proposition 4.11. For each $m \ge 0$, the representation $\pi_m$ is irreducible.


Proof. It suffices to show that every nonzero invariant subspace of $V_m$ is equal to
$V_m$. So, let $W$ be such a space and let $w$ be a nonzero element of $W$. Then $w$ can be
written in the form

\[
w = a_0 z_1^{m} + a_1 z_1^{m-1} z_2 + a_2 z_1^{m-2} z_2^{2} + \cdots + a_m z_2^{m}
\]

with at least one of the $a_k$’s being nonzero. Let $k_0$ be the smallest value of $k$ for
which $a_k \ne 0$ and consider

\[
\pi_m(X)^{m-k_0} w .
\]

(See Figure 4.1.)

Since each application of $\pi_m(X)$ raises the power of $z_2$ by 1, $\pi_m(X)^{m-k_0}$ will kill
all the terms in $w$ except the $a_{k_0} z_1^{m-k_0} z_2^{k_0}$ term. On the other hand, since $\pi_m(X)(z_1^{m-k} z_2^{k})$
is zero only if $k = m$, we see that $\pi_m(X)^{m-k_0} w$ is a nonzero multiple of $z_2^{m}$. Since
$W$ is assumed invariant, $W$ must contain this multiple of $z_2^{m}$ and so also $z_2^{m}$ itself.
Now, for $0 \le k \le m$, it follows from (4.5) that $\pi_m(Y)^{k} z_2^{m}$ is a nonzero multiple
of $z_1^{k} z_2^{m-k}$. Therefore, $W$ must also contain $z_1^{k} z_2^{m-k}$ for all $0 \le k \le m$. Since these
elements form a basis for $V_m$, we see that $W = V_m$, as desired. □


4.3 New Representations from Old 


One way of generating representations is to take some representations one knows 
and combine them in some fashion. In this section, we will consider three 
standard methods of obtaining new representations from old, namely direct sums 
of representations, tensor products of representations, and dual representations. 


4.3.1 Direct Sums 


Definition 4.12. Let $G$ be a matrix Lie group and let $\Pi_1, \Pi_2, \ldots, \Pi_m$ be
representations of $G$ acting on vector spaces $V_1, V_2, \ldots, V_m$. Then the direct sum of
$\Pi_1, \Pi_2, \ldots, \Pi_m$ is a representation $\Pi_1 \oplus \cdots \oplus \Pi_m$ of $G$ acting on the space
$V_1 \oplus \cdots \oplus V_m$, defined by

\[
[\Pi_1 \oplus \cdots \oplus \Pi_m (A)](v_1, \ldots, v_m) = (\Pi_1(A) v_1, \ldots, \Pi_m(A) v_m)
\]

for all $A \in G$.

Similarly, if $\mathfrak{g}$ is a Lie algebra, and $\pi_1, \pi_2, \ldots, \pi_m$ are representations of $\mathfrak{g}$ acting
on $V_1, V_2, \ldots, V_m$, then we define the direct sum of $\pi_1, \pi_2, \ldots, \pi_m$, acting on
$V_1 \oplus \cdots \oplus V_m$, by

\[
[\pi_1 \oplus \cdots \oplus \pi_m (X)](v_1, \ldots, v_m) = (\pi_1(X) v_1, \ldots, \pi_m(X) v_m)
\]

for all $X \in \mathfrak{g}$.


It is straightforward to check that, say, $\Pi_1 \oplus \cdots \oplus \Pi_m$ is really a representation
of $G$.


4.3.2 Tensor Products 


Let $U$ and $V$ be finite-dimensional real or complex vector spaces. We wish to define
the tensor product of $U$ and $V$, which will be a new vector space $U \otimes V$ “built” out
of $U$ and $V$. We will discuss the idea of this first and then give the precise definition.

We wish to consider a formal “product” of an element $u$ of $U$ with an element $v$
of $V$, denoted $u \otimes v$. The space $U \otimes V$ is then the space of linear combinations of
such products, that is, the space of elements of the form

\[
a_1 u_1 \otimes v_1 + a_2 u_2 \otimes v_2 + \cdots + a_n u_n \otimes v_n . \tag{4.6}
\]

Of course, if “$\otimes$” is to be interpreted as a product, then it should be bilinear:

\[
\begin{aligned}
(u_1 + a u_2) \otimes v &= u_1 \otimes v + a\, u_2 \otimes v, \\
u \otimes (v_1 + a v_2) &= u \otimes v_1 + a\, u \otimes v_2 .
\end{aligned}
\]

We do not assume that the product is commutative. That is to say, even if $U = V$ so
that $U \otimes V$ and $V \otimes U$ are the same space, $u \otimes v$ will not, in general, equal $v \otimes u$.

Now, if $e_1, e_2, \ldots, e_n$ is a basis for $U$ and $f_1, f_2, \ldots, f_m$ is a basis for $V$, then,
using bilinearity, it is easy to see that any element of the form (4.6) can be written as
a linear combination of the elements $e_j \otimes f_k$. In fact, it seems reasonable to expect
that $\{e_j \otimes f_k \mid 1 \le j \le n,\ 1 \le k \le m\}$ should be a basis for the space $U \otimes V$. This,
in fact, turns out to be the case.


Definition 4.13. If $U$ and $V$ are finite-dimensional real or complex vector spaces,
then a tensor product of $U$ with $V$ is a vector space $W$, together with a bilinear
map $\phi : U \times V \to W$ with the following property: If $\psi$ is any bilinear map of
$U \times V$ into a vector space $X$, there exists a unique linear map $\tilde{\psi}$ of $W$ into $X$ such
that the following diagram commutes (that is, $\tilde{\psi} \circ \phi = \psi$):

    U × V ---- φ ----> W
         \            /
        ψ \          / ψ~
           v        v
               X


Note that the bilinear map $\psi$ from $U \times V$ into $X$ turns into the linear map $\tilde{\psi}$ of
$W$ into $X$. This is one of the points of tensor products: Bilinear maps on $U \times V$ turn
into linear maps on $W$.


Theorem 4.14. If $U$ and $V$ are any finite-dimensional real or complex vector
spaces, then a tensor product $(W, \phi)$ exists. Furthermore, $(W, \phi)$ is unique up to
canonical isomorphism. That is, if $(W_1, \phi_1)$ and $(W_2, \phi_2)$ are two tensor products,
then there exists a unique vector space isomorphism $\Phi : W_1 \to W_2$ such that the
following diagram commutes (that is, $\Phi \circ \phi_1 = \phi_2$):

    U × V ---- φ_1 ----> W_1
         \              /
      φ_2 \            / Φ
           v          v
               W_2


Suppose that $(W, \phi)$ is a tensor product and that $e_1, e_2, \ldots, e_n$ is a basis for $U$
and $f_1, f_2, \ldots, f_m$ is a basis for $V$. Then

\[
\{\phi(e_j, f_k) \mid 1 \le j \le n,\ 1 \le k \le m\}
\]

is a basis for $W$.
In particular, $\dim(U \otimes V) = (\dim U)(\dim V)$.

Proof. Exercise 7. □


Notation 4.15 Since the tensor product of $U$ and $V$ is essentially unique, we will
let $U \otimes V$ denote an arbitrary tensor product space and we will write $u \otimes v$ instead
of $\phi(u, v)$. In this notation, Theorem 4.14 says that

\[
\{e_j \otimes f_k \mid 1 \le j \le n,\ 1 \le k \le m\}
\]

is a basis for $U \otimes V$, as expected.


The defining property of $U \otimes V$ is called the universal property of tensor
products. To understand the significance of this property, suppose we want to define
a linear map $T$ from $U \otimes V$ into some other space. We could try to define $T$ using
bases for $U$ and $V$, but then we would have to worry about whether $T$ depends on
the choice of basis. Instead, we could try to define $T$ on elements of the form $u \otimes v$,
with $u \in U$ and $v \in V$. While it follows from Theorem 4.14 that elements of this
form span $U \otimes V$, the decomposition of an element of $U \otimes V$ as a linear combination
of elements of the form $u \otimes v$ is far from unique. Thus, if we wish to define $T$ on
such elements, we have to worry whether $T$ is well defined.
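
To make the non-uniqueness concrete, here is a small worked example (added for illustration; the choice $U = V = \mathbb{C}^2$ with basis $e_1, e_2$ is an assumption of the example). Bilinearity alone gives two different-looking expressions for the same element:

```latex
\[
(e_1 + e_2) \otimes (e_1 - e_2) + (e_1 - e_2) \otimes (e_1 + e_2)
  = 2\, e_1 \otimes e_1 - 2\, e_2 \otimes e_2 .
\]
```

The left-hand side is a sum of two products $u \otimes v$, and so is the right-hand side, but the vectors $u$ and $v$ involved are different. Any rule for $T(u \otimes v)$ must therefore give the same total on both sides, which is exactly what bilinearity of the rule guarantees.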
This is where the universal property comes in. If $\psi(u, v)$ is any bilinear
expression in $(u, v)$, the universal property says precisely that there is a unique linear
map $T$ ($= \tilde{\psi}$) such that

\[
T(u \otimes v) = \psi(u, v) .
\]

Thus, we can construct a well-defined linear map $T$ on $U \otimes V$ simply by defining
it on elements of the form $u \otimes v$, provided that our definition of $T(u \otimes v)$ is bilinear
in $u$ and $v$. The following result is an application of this line of reasoning.


Proposition 4.16. Let $U$ and $V$ be finite-dimensional real or complex vector
spaces. Let $A : U \to U$ and $B : V \to V$ be linear operators. Then there exists a
unique linear operator from $U \otimes V$ to $U \otimes V$, denoted $A \otimes B$, such that

\[
(A \otimes B)(u \otimes v) = (Au) \otimes (Bv)
\]

for all $u \in U$ and $v \in V$. If $A_1$ and $A_2$ are linear operators on $U$ and $B_1$ and $B_2$
are linear operators on $V$, then

\[
(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2) . \tag{4.7}
\]

Proof. Define a map $\psi$ from $U \times V$ into $U \otimes V$ by

\[
\psi(u, v) = (Au) \otimes (Bv) .
\]

Since $A$ and $B$ are linear and since $\otimes$ is bilinear, $\psi$ is a bilinear map of $U \times V$ into
$U \otimes V$. By the universal property, there is an associated linear map $\tilde{\psi} : U \otimes V \to
U \otimes V$ such that

\[
\tilde{\psi}(u \otimes v) = \psi(u, v) = (Au) \otimes (Bv) .
\]

Thus, $\tilde{\psi}$ is the desired map $A \otimes B$. An elementary calculation shows that the
identity (4.7) holds on elements of the form $u \otimes v$. Since, by Theorem 4.14, such
elements span $U \otimes V$, the identity holds in general. □
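
In coordinates, the operator $A \otimes B$ is given by the Kronecker product of the matrices of $A$ and $B$, provided the basis $e_j \otimes f_k$ is ordered with $j$ as the major index and $k$ as the minor index. The sketch below is an added illustration under that ordering assumption; the helper names `mat_mul` and `kron` and the sample matrices are hypothetical.

```python
# Sketch: Kronecker-product model of A ⊗ B, with exact integer checks of
# (A ⊗ B)(u ⊗ v) = (Au) ⊗ (Bv) and of the identity (4.7).
# Names and sample data are illustrative; only the standard library is used.

def mat_mul(a, b):
    """Product of matrices stored as nested lists (rectangular shapes allowed)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product: entry ((i_a, i_b), (j_a, j_b)) is a[i_a][j_a] * b[i_b][j_b]."""
    p, q = len(b), len(b[0])
    return [[a[i // p][j // q] * b[i % p][j % q]
             for j in range(len(a[0]) * q)]
            for i in range(len(a) * p)]

A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [-1, 2]]
B1 = [[2, 0, 1], [1, 1, 0], [0, 3, -1]]
B2 = [[1, -1, 0], [2, 0, 1], [0, 1, 1]]

# Column vectors u in U and v in V, stored as n x 1 matrices.
u = [[1], [2]]
v = [[3], [0], [-1]]

# (A1 ⊗ B1)(u ⊗ v) versus (A1 u) ⊗ (B1 v)
action_ok = mat_mul(kron(A1, B1), kron(u, v)) == kron(mat_mul(A1, u), mat_mul(B1, v))

# Identity (4.7): (A1 ⊗ B1)(A2 ⊗ B2) versus (A1 A2) ⊗ (B1 B2)
product_ok = mat_mul(kron(A1, B1), kron(A2, B2)) == kron(mat_mul(A1, A2), mat_mul(B1, B2))

print(action_ok, product_ok)
```

Because the entries are integers, both comparisons are exact rather than approximate.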


We are now ready to define tensor products of representations. There are two
different approaches to this, both of which are important. The first approach starts
with a representation of a group $G$ acting on a space $U$ and a representation of
another group $H$ acting on a space $V$ and produces a representation of the product
group $G \times H$ acting on the space $U \otimes V$. The second approach starts with two
different representations of the same group $G$, acting on spaces $U$ and $V$, and
produces a representation of $G$ acting on $U \otimes V$. Both of these approaches can
be adapted to apply to Lie algebras.


Definition 4.17. Let $G$ and $H$ be matrix Lie groups. Let $\Pi_1$ be a representation of
$G$ acting on a space $U$ and let $\Pi_2$ be a representation of $H$ acting on a space $V$.
Then the tensor product of $\Pi_1$ and $\Pi_2$ is a representation $\Pi_1 \otimes \Pi_2$ of $G \times H$
acting on $U \otimes V$ defined by

\[
(\Pi_1 \otimes \Pi_2)(A, B) = \Pi_1(A) \otimes \Pi_2(B)
\]

for all $A \in G$ and $B \in H$.


Using Proposition 4.16, it is easy to check that $\Pi_1 \otimes \Pi_2$ is, in fact, a
representation of $G \times H$.

Now, if $G$ and $H$ are matrix Lie groups (i.e., $G$ is a closed subgroup of $\mathrm{GL}(n;\mathbb{C})$
and $H$ is a closed subgroup of $\mathrm{GL}(m;\mathbb{C})$), then $G \times H$ can be regarded in an obvious
way as a closed subgroup of $\mathrm{GL}(n + m;\mathbb{C})$. Thus, the direct product of matrix Lie
groups can be regarded as a matrix Lie group. It is easy to check that the Lie algebra
of $G \times H$ is isomorphic to the direct sum of the Lie algebra of $G$ and the Lie algebra
of $H$.


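Proposition 4.18. Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and
$\mathfrak{h}$, respectively. Let $\Pi_1$ and $\Pi_2$ be representations of $G$ and $H$, respectively, and
consider the representation $\Pi_1 \otimes \Pi_2$ of $G \times H$. If $\pi_1 \otimes \pi_2$ denotes the associated
representation of $\mathfrak{g} \oplus \mathfrak{h}$, then

\[ (\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y) \]

for all $X \in \mathfrak{g}$ and $Y \in \mathfrak{h}$.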


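Proof. Suppose that $u(t)$ is a smooth curve in $U$ and $v(t)$ is a smooth curve in $V$.
Then, by repeating the proof of the product rule for scalar-valued functions (or by
calculating everything in a basis), we have

\[ \frac{d}{dt}\bigl(u(t) \otimes v(t)\bigr) = \frac{du}{dt} \otimes v(t) + u(t) \otimes \frac{dv}{dt}. \]

This being the case, we compute as follows:

\[
\begin{aligned}
(\pi_1 \otimes \pi_2)(X, Y)(u \otimes v)
&= \left.\frac{d}{dt}\, \Pi_1(e^{tX})u \otimes \Pi_2(e^{tY})v \,\right|_{t=0} \\
&= \left( \left.\frac{d}{dt}\, \Pi_1(e^{tX})u \,\right|_{t=0} \right) \otimes v
 + u \otimes \left( \left.\frac{d}{dt}\, \Pi_2(e^{tY})v \,\right|_{t=0} \right).
\end{aligned}
\]

This establishes the claimed form of $(\pi_1 \otimes \pi_2)(X, Y)$ on elements of the form $u \otimes v$,
which span $U \otimes V$. □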


The proposition motivates the following definition. 
Definition 4.19. Let $\mathfrak{g}$ and $\mathfrak{h}$ be Lie algebras and let $\pi_1$ and $\pi_2$ be representations of
$\mathfrak{g}$ and $\mathfrak{h}$, acting on spaces $U$ and $V$. Then the tensor product of $\pi_1$ and $\pi_2$, denoted
$\pi_1 \otimes \pi_2$, is a representation of $\mathfrak{g} \oplus \mathfrak{h}$ acting on $U \otimes V$, given by

\[ (\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y) \]

for all $X \in \mathfrak{g}$ and $Y \in \mathfrak{h}$.


It is easy to check that this indeed defines a representation of $\mathfrak{g} \oplus \mathfrak{h}$. Note that if
we defined $(\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes \pi_2(Y)$, this would not be a representation
of $\mathfrak{g} \oplus \mathfrak{h}$, since this expression is not linear in $(X, Y)$.
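Both claims can be tested numerically in a small case. The sketch below assumes that $\pi_1$ and $\pi_2$ are the identity maps on $2 \times 2$ matrices and models $\otimes$ by the Kronecker product; the helper names (`kron`, `bracket`, `tensor_rep`) and the sample matrices are illustrative, not taken from the text.

```python
# Illustrative sketch: X (x) I + I (x) Y preserves brackets, X (x) Y is not linear.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def kron(A, B):
    """Kronecker product, a matrix model of A (x) B."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def bracket(P, Q):
    """Commutator [P, Q] = PQ - QP."""
    return add(matmul(P, Q), scale(-1, matmul(Q, P)))

I2 = [[1, 0], [0, 1]]

def tensor_rep(X, Y):
    """(pi_1 (x) pi_2)(X, Y) = X (x) I + I (x) Y, with pi_1 = pi_2 = identity."""
    return add(kron(X, I2), kron(I2, Y))

# Sample elements (X1, Y1) and (X2, Y2) of the direct sum (hypothetical data).
X1, Y1 = [[0, 1], [0, 0]], [[1, 0], [0, -1]]
X2, Y2 = [[0, 0], [1, 0]], [[0, 1], [0, 0]]

# The bracket on the direct sum is componentwise.
lhs = tensor_rep(bracket(X1, X2), bracket(Y1, Y2))
rhs = bracket(tensor_rep(X1, Y1), tensor_rep(X2, Y2))
preserves_brackets = (lhs == rhs)

# Doubling (X, Y) multiplies X (x) Y by 4, not 2, so the naive formula is not linear.
naive_is_linear = (kron(scale(2, X1), scale(2, Y1)) == scale(2, kron(X1, Y1)))
print(preserves_brackets, naive_is_linear)
```

The cross terms vanish because $X \otimes I$ and $I \otimes Y$ commute, which is what the first comparison reflects.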

We now define a variant of the above definitions in which we take the tensor 
product of two representations of the same group G and regard the result as a 
representation of G rather than of G x G. 


Definition 4.20. Let $G$ be a matrix Lie group and let $\Pi_1$ and $\Pi_2$ be representations
of $G$, acting on spaces $V_1$ and $V_2$. Then the tensor product representation of $G$,
acting on $V_1 \otimes V_2$, is defined by

\[ (\Pi_1 \otimes \Pi_2)(A) = \Pi_1(A) \otimes \Pi_2(A) \]

for all $A \in G$. Similarly, if $\pi_1$ and $\pi_2$ are representations of a Lie algebra $\mathfrak{g}$, we
define a tensor product representation of $\mathfrak{g}$ on $V_1 \otimes V_2$ by

\[ (\pi_1 \otimes \pi_2)(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X). \]


It is easy to check that $\Pi_1 \otimes \Pi_2$ and $\pi_1 \otimes \pi_2$ are actually representations of
$G$ and $\mathfrak{g}$, respectively. The notation is, unfortunately, ambiguous, since if $\Pi_1$ and
$\Pi_2$ are representations of the same group $G$, we can regard $\Pi_1 \otimes \Pi_2$ either as a
representation of $G$ or as a representation of $G \times G$. We must, therefore, be careful
to specify which way we are thinking about $\Pi_1 \otimes \Pi_2$.

If $\Pi_1$ and $\Pi_2$ are irreducible representations of a group $G$, then $\Pi_1 \otimes \Pi_2$ will
typically not be irreducible when viewed as a representation of $G$. One can, then,
attempt to decompose $\Pi_1 \otimes \Pi_2$ as a direct sum of irreducible representations. This
process is called the Clebsch-Gordan theory or, in the physics literature, “addition
of angular momentum.” See Exercise 12 and Appendix C for more information
about this topic.


4.3.3 Dual Representations 


Suppose that $\pi$ is a representation of a Lie algebra $\mathfrak{g}$ acting on a finite-dimensional
vector space $V$. Let $V^*$ denote the dual space of $V$, that is, the space of linear
functionals on $V$ (Sect. A.7). If $A$ is a linear operator on $V$, let $A^{tr}$ denote the dual
or transpose operator on $V^*$, given by

\[ (A^{tr}\phi)(v) = \phi(Av) \]

for $\phi \in V^*$, $v \in V$. If $v_1, \ldots, v_n$ is a basis for $V$, then there is a naturally associated
“dual basis” $\phi_1, \ldots, \phi_n$ with the property that $\phi_j(v_k) = \delta_{jk}$. The matrix for $A^{tr}$
in the dual basis is then simply the transpose (not the conjugate transpose!) of the
matrix of $A$ in the original basis. If $A$ and $B$ are linear operators on $V$, it is easily
verified that

\[ (AB)^{tr} = B^{tr}A^{tr}. \tag{4.8} \]


Definition 4.21. Suppose $G$ is a matrix Lie group and $\Pi$ is a representation of $G$
acting on a finite-dimensional vector space $V$. Then the dual representation $\Pi^*$ to
$\Pi$ is the representation of $G$ acting on $V^*$ and given by

\[ \Pi^*(g) = [\Pi(g^{-1})]^{tr}. \tag{4.9} \]

If $\pi$ is a representation of a Lie algebra $\mathfrak{g}$ acting on a finite-dimensional vector space
$V$, then $\pi^*$ is the representation of $\mathfrak{g}$ acting on $V^*$ and given by

\[ \pi^*(X) = -\pi(X)^{tr}. \tag{4.10} \]

Using (4.8), it is easy to check that both $\Pi^*$ and $\pi^*$ are actually representations.
(Here the inverse on the right-hand side of (4.9) and the minus sign on the right-hand
side of (4.10) are essential.) The dual representation is also called the contragredient
representation.
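To see concretely why the inverse in (4.9) and the minus sign in (4.10) are needed, one can test both formulas, and their naive variants, on small matrices. The sketch below assumes the representations are the identity maps on $2 \times 2$ matrices; all helper names and sample matrices are illustrative.

```python
# Illustrative sketch: the dual needs the inverse (groups) and the minus sign (algebras).

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def bracket(P, Q):
    return sub(matmul(P, Q), matmul(Q, P))

def lie_dual(P):
    """pi*(X) = -pi(X)^tr, as in (4.10)."""
    return [[-p for p in row] for row in transpose(P)]

def inv2(g):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = g
    return [[d, -b], [-c, a]]

def group_dual(g):
    """Pi*(g) = [Pi(g^{-1})]^tr, as in (4.9)."""
    return transpose(inv2(g))

H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
g = [[1, 1], [0, 1]]
h = [[1, 0], [1, 1]]

# With the minus sign, brackets are preserved; with the bare transpose they are not.
lie_ok = bracket(lie_dual(H), lie_dual(X)) == lie_dual(bracket(H, X))
lie_without_minus = bracket(transpose(H), transpose(X)) == transpose(bracket(H, X))

# With the inverse, products are preserved; with the bare transpose they are reversed.
group_ok = group_dual(matmul(g, h)) == matmul(group_dual(g), group_dual(h))
group_without_inverse = transpose(matmul(g, h)) == matmul(transpose(g), transpose(h))

print(lie_ok, lie_without_minus, group_ok, group_without_inverse)
```

The two failing variants fail for the same reason: by (4.8), the transpose reverses the order of products.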


Proposition 4.22. If $\Pi$ is a representation of a matrix Lie group $G$, then (1) $\Pi^*$ is
irreducible if and only if $\Pi$ is irreducible and (2) $(\Pi^*)^*$ is isomorphic to $\Pi$. Similar
statements apply to Lie algebra representations.


Proof. See Exercise 6. □


4.4 Complete Reducibility 


Much of representation theory is concerned with studying irreducible represen- 
tations of a group or Lie algebra. In favorable cases, knowing the irreducible 
representations leads to a description of all representations. 


Definition 4.23. A finite-dimensional representation of a group or Lie algebra is 
said to be completely reducible if it is isomorphic to a direct sum of a finite number 
of irreducible representations. 


Definition 4.24. A group or Lie algebra is said to have the complete reducibility 
property if every finite-dimensional representation of it is completely reducible. 


As it turns out, most groups and Lie algebras do not have the complete reducibility
property. Nevertheless, many interesting examples of groups and Lie algebras do have
this property, as we will see in this section and Sect. 10.3.


Example 4.25. Let $\Pi : \mathbb{R} \to GL(2; \mathbb{C})$ be given by

\[ \Pi(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}. \]

Then $\Pi$ is not completely reducible.


Proof. Direct calculation shows that $\Pi$ is, in fact, a representation of $\mathbb{R}$. If $\{e_1, e_2\}$
is the standard basis for $\mathbb{C}^2$, then clearly the span of $e_1$ is an invariant subspace.
We now claim that $\langle e_1 \rangle$ is the only nontrivial invariant subspace for $\Pi$. To see this,
suppose $V$ is a nonzero invariant subspace and suppose $V$ contains a vector not in
the span of $e_1$, say, $v = ae_1 + be_2$ with $b \neq 0$. Then

\[ \Pi(1)v - v = be_1 \]

also belongs to $V$. Thus, $e_1$ and $e_2 = (v - ae_1)/b$ belong to $V$, showing that
$V = \mathbb{C}^2$. We conclude, then, that $\mathbb{C}^2$ does not decompose as a direct sum of
irreducible invariant subspaces. □
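The two facts used in this example can be checked by hand or, as in the following sketch, by a short computation with integer entries. The function names (`Pi`, `is_invariant_line`) and the sampled values of $x$ are illustrative assumptions; the check covers only finitely many $x$ and only real integer vectors, so it illustrates the argument rather than replacing it.

```python
# Illustrative sketch for Example 4.25: Pi(x) = [[1, x], [0, 1]].

def Pi(x):
    return [[1, x], [0, 1]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

# Pi is a homomorphism from (R, +): Pi(x) Pi(y) = Pi(x + y) on sampled integers.
is_homomorphism = all(matmul(Pi(x), Pi(y)) == Pi(x + y)
                      for x in range(-3, 4) for y in range(-3, 4))

def is_invariant_line(v):
    """True if Pi(x) v stays parallel to v for every sampled x (v nonzero)."""
    return all(v[0] * w[1] - v[1] * w[0] == 0
               for w in (apply(Pi(x), v) for x in range(-3, 4)))

print(is_homomorphism)
print(is_invariant_line([1, 0]), is_invariant_line([0, 1]), is_invariant_line([3, 2]))
```

For $v = ae_1 + be_2$ the determinant tested in `is_invariant_line` equals $-xb^2$, which vanishes for all $x$ only when $b = 0$, in agreement with the proof.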


Proposition 4.26. If V is a completely reducible representation of a group or Lie 
algebra, then the following properties hold. 


1. For every invariant subspace U of V, there is another invariant subspace W 
such that V is the direct sum of U and W. 
2. Every invariant subspace of V is completely reducible. 


Proof. For Point 1, suppose that $V$ decomposes as

\[ V = U_1 \oplus U_2 \oplus \cdots \oplus U_k, \]

where the $U_j$’s are irreducible invariant subspaces, and that $U$ is any invariant
subspace of $V$. If $U$ is all of $V$, then we can take $W = \{0\}$ and we are done. If
$U \neq V$, there must be some $j_1$ such that $U_{j_1}$ is not contained in $U$. Since $U_{j_1}$ is
irreducible, it follows that the invariant subspace $U_{j_1} \cap U$ must be $\{0\}$. Suppose now
that $U + U_{j_1} = V$. If so, the sum is direct (since $U_{j_1} \cap U = \{0\}$) and we are done.

If $U + U_{j_1} \neq V$, there is some $j_2$ such that $U + U_{j_1}$ does not contain $U_{j_2}$,
in which case, $(U + U_{j_1}) \cap U_{j_2} = \{0\}$. Proceeding on in the same way, we must
eventually obtain some family $j_1, j_2, \ldots, j_l$ such that $U + U_{j_1} + \cdots + U_{j_l} = V$ and
the sum is direct. Then $W := U_{j_1} + \cdots + U_{j_l}$ is the desired complement to $U$.

For Point 2, suppose $U$ is an invariant subspace of $V$. We first establish that $U$
has the “invariant complement property” in Point 1. Suppose, then, that $X$ is another
invariant subspace of $V$ with $X \subset U$. By Point 1, we can find an invariant subspace $Y$
such that $V = X \oplus Y$. Let $Z = Y \cap U$, which is then an invariant subspace. We
want to show that $U = X \oplus Z$. For all $u \in U$, we can write $u = x + y$ with $x \in X$
and $y \in Y$. But since $X \subset U$, we have $x \in U$ and therefore $y = u - x \in U$. Thus,
$y \in Z = Y \cap U$. We have shown, then, that every $u \in U$ can be written as the sum
of an element of $X$ and an element of $Z$. Furthermore, $X \cap Z \subset X \cap Y = \{0\}$, so
actually $U$ is the direct sum of $X$ and $Z$.

We may now easily show that $U$ is completely reducible. If $U$ is irreducible, we
are done. If not, $U$ has a nontrivial invariant subspace $X$ and thus $U$ decomposes
as $U = X \oplus Z$ for some invariant subspace $Z$. If $X$ and $Z$ are irreducible, we
are done, and if not, we proceed on in the same way. Since $U$ is finite dimensional,
this process must eventually terminate with $U$ being decomposed as a direct sum of
irreducibles. □


Proposition 4.27. If $G$ is a matrix Lie group and $\Pi$ is a finite-dimensional unitary
representation of $G$, then $\Pi$ is completely reducible. Similarly, if $\mathfrak{g}$ is a real Lie
algebra and $\pi$ is a finite-dimensional “unitary” representation of $\mathfrak{g}$ (meaning that
$\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$), then $\pi$ is completely reducible.


Proof. Let $V$ denote the Hilbert space on which $\Pi$ or $\pi$ acts and let $\langle \cdot, \cdot \rangle$ denote the
inner product on $V$. If $W \subset V$ is an invariant subspace, let $W^\perp$ be the orthogonal
complement of $W$, so that $V$ is the direct sum of $W$ and $W^\perp$. We claim that $W^\perp$ is
also an invariant subspace for $\Pi$ or $\pi$.

To see this, note that since $\Pi$ is unitary, $\Pi(A)^* = \Pi(A)^{-1} = \Pi(A^{-1})$ for all
$A \in G$. Then, for any $w \in W$ and any $v \in W^\perp$, we have

\[
\begin{aligned}
\langle \Pi(A)v, w \rangle &= \langle v, \Pi(A)^* w \rangle = \langle v, \Pi(A^{-1})w \rangle \\
&= \langle v, w' \rangle = 0.
\end{aligned}
\]

In the last step, we have used that $w' = \Pi(A^{-1})w$ is in $W$, since $W$ is invariant.
This shows that $\Pi(A)v$ is orthogonal to every element of $W$, as claimed. A similar
argument, with $\Pi(A^{-1})$ replaced by $-\pi(X)$, shows that the orthogonal complement
of an invariant subspace for $\pi$ is also invariant.

We have established, then, that for a unitary representation, the orthogonal
complement of an invariant subspace is again invariant. Suppose now that $V$ is not
irreducible. Then we can find an invariant subspace $W$ that is neither $\{0\}$ nor $V$, and
we decompose $V$ as $W \oplus W^\perp$. Then $W$ and $W^\perp$ are both invariant subspaces and
thus unitary representations of $G$ in their own right. Then $W$ is either irreducible or
it splits as an orthogonal direct sum of invariant subspaces, and similarly for $W^\perp$.
We continue this process, and since $V$ is finite dimensional, it cannot go on forever,
and we eventually arrive at a decomposition of $V$ as a direct sum of irreducible
invariant subspaces. □


Theorem 4.28. If $G$ is a compact matrix Lie group, every finite-dimensional
representation of $G$ is completely reducible.


See also Sect. 10.3 for a similar result for semisimple Lie algebras. The argument 
below is sometimes called “Weyl’s unitarian trick” for the role of unitarity in the 
proof. We require a notion of integration over matrix Lie groups that is invariant
under the right action of the group. One way to construct such a right-invariant 
integral is to construct a right-invariant measure on G, known as a Haar measure. 
It is, however, simpler to introduce the integral by means of a right-invariant 
differential form on G. (See Appendix B for a quick introduction to the notion of 
differential forms.) 

If $G \subset M_n(\mathbb{C})$ is a matrix Lie group, then the tangent space to $G$ at the identity
is the Lie algebra $\mathfrak{g}$ of $G$ (Corollary 3.46). It is then easy to see that the tangent
space $T_A G$ at any point $A \in G$ is the space of vectors of the form $XA$ with $X \in \mathfrak{g}$.
If the dimension of $\mathfrak{g}$ as a real vector space is $k$, choose a nonzero $k$-linear,
alternating form $\alpha_I : \mathfrak{g}^k \to \mathbb{R}$. (Such a form exists and is unique up to multiplication
by a constant.) Then we may define a $k$-linear, alternating form $\alpha_A : (T_A G)^k \to \mathbb{R}$
by setting

\[ \alpha_A(Y_1, \ldots, Y_k) = \alpha_I(Y_1 A^{-1}, \ldots, Y_k A^{-1}) \]

for all $Y_1, \ldots, Y_k \in T_A G$. That is to say, we define $\alpha$ in an arbitrary nonzero fashion
at the identity, and we then use the right action of $G$ to “transport” $\alpha$ to every other
point in $G$. The resulting family of functionals is a $k$-form on $G$.

Once such an $\alpha$ has been constructed, we can use it to construct an orientation
on $G$, by decreeing that an ordered basis $Y_1, \ldots, Y_k$ for $T_A G$ is positively oriented
if $\alpha_A(Y_1, \ldots, Y_k) > 0$. If $f : G \to \mathbb{R}$ is a smooth function, we can integrate the
$k$-form $f\alpha$ over nice domains in $G$. If $G$ is compact, we may integrate $f\alpha$ over all
of $G$, leading to a notion of integration, which we denote as

\[ \int_G f(A)\, \alpha(A). \]

Since the orientation on $G$ was defined in terms of the $k$-form $\alpha$ itself, it is not
hard to see that if $f(A) > 0$ for all $A$, then $\int_G f\alpha > 0$. Furthermore, since the form
$\alpha$ was constructed using the right action of $G$, it is easily seen to be invariant under
that action. As a result, the notion of integration of a function over a compact group
$G$ is invariant under the right action of $G$: For all $B \in G$, we have

\[ \int_G f(AB)\, \alpha(A) = \int_G f(A)\, \alpha(A). \]


Proof of Theorem 4.28. Choose an arbitrary inner product $\langle \cdot, \cdot \rangle$ on $V$, and then
define a map $\langle \cdot, \cdot \rangle_G : V \times V \to \mathbb{C}$ by the formula

\[ \langle v, w \rangle_G = \int_G \langle \Pi(A)v, \Pi(A)w \rangle\, \alpha(A). \]

It is easy to check that $\langle \cdot, \cdot \rangle_G$ is an inner product; in particular, the positivity of
$\langle \cdot, \cdot \rangle_G$ holds because $\langle \Pi(A)v, \Pi(A)v \rangle > 0$ for all $A$ if $v \neq 0$. We now compute
that for each $B \in G$, we have

\[
\begin{aligned}
\langle \Pi(B)v, \Pi(B)w \rangle_G
&= \int_G \langle \Pi(A)\Pi(B)v, \Pi(A)\Pi(B)w \rangle\, \alpha(A) \\
&= \int_G \langle \Pi(AB)v, \Pi(AB)w \rangle\, \alpha(A) \\
&= \int_G \langle \Pi(A)v, \Pi(A)w \rangle\, \alpha(A) \\
&= \langle v, w \rangle_G,
\end{aligned}
\]

where we have used the right invariance of the integral in the third equality. This
computation shows that for each $B \in G$, the operator $\Pi(B)$ is unitary with respect
to $\langle \cdot, \cdot \rangle_G$. Thus, by Proposition 4.27, $\Pi$ is completely reducible. □


Note that compactness of the group $G$ is needed to ensure that the integral
defining $\langle \cdot, \cdot \rangle_G$ is convergent.
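The averaging idea is easiest to see for a finite group, where the integral over $G$ is replaced by a finite sum. The following sketch makes that substitution (it is not the construction used in the proof) for a two-element group acting on $\mathbb{R}^2$ by a non-orthogonal matrix; the matrix `P`, the Gram matrix `M`, and the helper names are all illustrative assumptions.

```python
# Illustrative sketch: averaging an inner product over a finite group (sum, not integral).

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

I2 = [[1, 0], [0, 1]]
P = [[1, 1], [0, -1]]      # satisfies P * P = I, so {I, P} represents a two-element group
group = [I2, P]

# Averaged form <v, w>_G = v^T M w, where M = sum over g of Pi(g)^T Pi(g).
M = [[0, 0], [0, 0]]
for g in group:
    M = add(M, matmul(transpose(g), g))

# Every Pi(g) preserves the averaged form, although not the standard one.
averaged_is_invariant = all(matmul(matmul(transpose(g), M), g) == M for g in group)
standard_is_invariant = all(matmul(transpose(g), g) == I2 for g in group)

# span(e1) is invariant; its complement with respect to M is spanned by w.
w = [M[0][1], -M[0][0]]    # chosen so that e1^T M w = 0
Pw = apply(P, w)
complement_is_invariant = (w[0] * Pw[1] - w[1] * Pw[0] == 0)

print(M, averaged_is_invariant, standard_is_invariant, complement_is_invariant)
```

In this example the standard orthogonal complement of $\langle e_1 \rangle$, namely $\langle e_2 \rangle$, is not invariant, whereas the complement taken with respect to the averaged form is.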


4.5 Schur’s Lemma 


It is desirable to be able to state Schur’s lemma simultaneously for groups and Lie 
algebras. In order to do so, we need to indulge in a common abuse of notation. 
If, say, $\Pi$ is a representation of $G$ acting on a space $V$, we will refer to $V$ as the
representation, without explicit reference to $\Pi$.


Theorem 4.29 (Schur’s Lemma).


1. Let $V$ and $W$ be irreducible real or complex representations of a group or Lie
algebra and let $\phi : V \to W$ be an intertwining map. Then either $\phi = 0$ or $\phi$ is
an isomorphism.

2. Let $V$ be an irreducible complex representation of a group or Lie algebra and
let $\phi : V \to V$ be an intertwining map of $V$ with itself. Then $\phi = \lambda I$, for some
$\lambda \in \mathbb{C}$.

3. Let $V$ and $W$ be irreducible complex representations of a group or Lie algebra
and let $\phi_1, \phi_2 : V \to W$ be nonzero intertwining maps. Then $\phi_1 = \lambda \phi_2$, for
some $\lambda \in \mathbb{C}$.


It is important to note that the last two points in the theorem hold only over $\mathbb{C}$ (or
some other algebraically closed field) and not over $\mathbb{R}$. See Exercise 8.
Before proving Schur’s lemma, we obtain two corollaries of it. 


Corollary 4.30. Let $\Pi$ be an irreducible complex representation of a matrix Lie
group $G$. If $A$ is in the center of $G$, then $\Pi(A) = \lambda I$, for some $\lambda \in \mathbb{C}$. Similarly,
if $\pi$ is an irreducible complex representation of a Lie algebra $\mathfrak{g}$ and if $X$ is in the
center of $\mathfrak{g}$, then $\pi(X) = \lambda I$.

Proof. We prove the group case; the proof of the Lie algebra case is similar. If $A$ is
in the center of $G$, then for all $B \in G$,

\[ \Pi(A)\Pi(B) = \Pi(AB) = \Pi(BA) = \Pi(B)\Pi(A). \]

However, this says exactly that $\Pi(A)$ is an intertwining map of the space with itself.
Thus, by Point 2 of Schur’s lemma, $\Pi(A)$ is a multiple of the identity. □


Corollary 4.31. An irreducible complex representation of a commutative group or 
Lie algebra is one dimensional. 


Proof. Again, we prove only the group case. If $G$ is commutative, the center of
$G$ is all of $G$, so by the previous corollary $\Pi(A)$ is a multiple of the identity for
each $A \in G$. However, this means that every subspace of $V$ is invariant! Thus,
the only way that $V$ can fail to have a nontrivial invariant subspace is if it is one
dimensional. □


We now provide the proof of Schur’s lemma. 


Proof of Theorem 4.29. As usual, we will prove just the group case; the proof of
the Lie algebra case requires only the obvious notational changes. Write $\Pi$ and $\Sigma$
for the actions of the group on $V$ and $W$, respectively. For Point 1, if
$v \in \ker(\phi)$, then

\[ \phi(\Pi(A)v) = \Sigma(A)\phi(v) = 0. \]

This shows that $\ker \phi$ is an invariant subspace of $V$. Since $V$ is irreducible, we must
have $\ker \phi = \{0\}$ or $\ker \phi = V$. Thus, $\phi$ is either one-to-one or zero.

Suppose $\phi$ is one-to-one. Then the image of $\phi$ is a nonzero subspace of $W$. On
the other hand, the image of $\phi$ is invariant, for if $w \in W$ is of the form $\phi(v)$ for
some $v \in V$, then

\[ \Sigma(A)w = \Sigma(A)\phi(v) = \phi(\Pi(A)v). \]

Since $W$ is irreducible and the image of $\phi$ is nonzero and invariant, we must have
$\operatorname{image}(\phi) = W$. Thus, $\phi$ is either zero or one-to-one and onto.

For Point 2, suppose $V$ is an irreducible complex representation and that
$\phi : V \to V$ is an intertwining map of $V$ to itself, that is, that $\phi\Pi(A) = \Pi(A)\phi$ for
all $A \in G$. Since we are working over an algebraically closed field, $\phi$ must have
at least one eigenvalue $\lambda \in \mathbb{C}$. If $U$ denotes the corresponding eigenspace for $\phi$,
then Proposition A.2 tells us that each $\Pi(A)$ maps $U$ to itself, meaning that $U$ is an
invariant subspace. Since $\lambda$ is an eigenvalue, $U \neq \{0\}$, and so we must have $U = V$,
which means that $\phi = \lambda I$ on all of $V$.

For Point 3, if $\phi_2 \neq 0$, then by Point 1, $\phi_2$ is an isomorphism. Then $\phi_1 \circ \phi_2^{-1}$
is an intertwining map of $W$ with itself. Thus, by Point 2, $\phi_1 \circ \phi_2^{-1} = \lambda I$, whence

\[ \phi_1 = \lambda \phi_2. \]

□


4.6 Representations of sl(2; C)


In this section, we will compute (up to isomorphism) all of the finite-dimensional
irreducible complex representations of the Lie algebra $\mathfrak{sl}(2; \mathbb{C})$. This computation is
important for several reasons. First, $\mathfrak{sl}(2; \mathbb{C})$ is the complexification of $\mathfrak{su}(2)$, which
in turn is isomorphic to $\mathfrak{so}(3)$, and the representations of $\mathfrak{so}(3)$ are of physical
significance. Indeed, the computation we will perform in the proof of Theorem 4.32
is found in every standard textbook on quantum mechanics, under the heading
“angular momentum.” Second, the representation theory of $\mathfrak{su}(2)$ is an illuminating
example of how one uses commutation relations to determine the representations of
a Lie algebra. Third, in determining the representations of a semisimple Lie algebra
$\mathfrak{g}$ (Chapters 6 and 7), we will make frequent use of the representation theory of
$\mathfrak{sl}(2; \mathbb{C})$, applying it to various subalgebras of $\mathfrak{g}$ that are isomorphic to $\mathfrak{sl}(2; \mathbb{C})$.
We use the following basis for $\mathfrak{sl}(2; \mathbb{C})$:

\[
X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]

which have the commutation relations

\[
\begin{aligned}
[H, X] &= 2X, \\
[H, Y] &= -2Y, \\
[X, Y] &= H.
\end{aligned}
\tag{4.11}
\]

If $V$ is a finite-dimensional complex vector space and $A$, $B$, and $C$ are operators on
$V$ satisfying $[A, B] = 2B$, $[A, C] = -2C$, and $[B, C] = A$, then because of the
skew symmetry and bilinearity of brackets, the unique linear map
$\pi : \mathfrak{sl}(2; \mathbb{C}) \to \mathfrak{gl}(V)$ satisfying

\[ \pi(H) = A, \quad \pi(X) = B, \quad \pi(Y) = C \]

will be a representation of $\mathfrak{sl}(2; \mathbb{C})$.
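The relations (4.11) are a two-line matrix computation, but it is convenient to have them checked mechanically before they are used repeatedly. The sketch below does this with exact integer arithmetic; the helper names are illustrative.

```python
# Illustrative sketch: verify the commutation relations (4.11) for the basis X, Y, H.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(P, Q):
    """Commutator [P, Q] = PQ - QP."""
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return [[PQ[i][j] - QP[i][j] for j in range(2)] for i in range(2)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

relations_hold = (bracket(H, X) == scale(2, X)
                  and bracket(H, Y) == scale(-2, Y)
                  and bracket(X, Y) == H)
print(relations_hold)
```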


Theorem 4.32. For each integer $m \geq 0$, there is an irreducible complex representation
of $\mathfrak{sl}(2; \mathbb{C})$ with dimension $m + 1$. Any two irreducible complex representations
of $\mathfrak{sl}(2; \mathbb{C})$ with the same dimension are isomorphic. If $\pi$ is an irreducible complex
representation of $\mathfrak{sl}(2; \mathbb{C})$ with dimension $m + 1$, then $\pi$ is isomorphic to the
representation $\pi_m$ described in Sect. 4.2.


Our goal is to show that any finite-dimensional irreducible representation of
$\mathfrak{sl}(2; \mathbb{C})$ “looks like” one of the representations $\pi_m$ coming from Example 4.10. In
that example, the space $V_m$ is spanned by eigenvectors for $\pi_m(H)$ and the operators
$\pi_m(X)$ and $\pi_m(Y)$ act by shifting the eigenvalues up or down in increments of 2.
We now introduce a simple but crucial lemma that allows us to develop a similar
structure in an arbitrary irreducible representation of $\mathfrak{sl}(2; \mathbb{C})$.
Lemma 4.33. Let $u$ be an eigenvector of $\pi(H)$ with eigenvalue $\alpha \in \mathbb{C}$. Then we
have

\[ \pi(H)\pi(X)u = (\alpha + 2)\pi(X)u. \]

Thus, either $\pi(X)u = 0$ or $\pi(X)u$ is an eigenvector for $\pi(H)$ with eigenvalue
$\alpha + 2$. Similarly,

\[ \pi(H)\pi(Y)u = (\alpha - 2)\pi(Y)u, \]

so that either $\pi(Y)u = 0$ or $\pi(Y)u$ is an eigenvector for $\pi(H)$ with eigenvalue
$\alpha - 2$.


Proof. We know that $[\pi(H), \pi(X)] = \pi([H, X]) = 2\pi(X)$. Thus,

\[
\begin{aligned}
\pi(H)\pi(X)u &= \pi(X)\pi(H)u + 2\pi(X)u \\
&= \pi(X)(\alpha u) + 2\pi(X)u \\
&= (\alpha + 2)\pi(X)u.
\end{aligned}
\]

The argument with $\pi(X)$ replaced by $\pi(Y)$ is similar. □


Proof of Theorem 4.32. Let $\pi$ be an irreducible representation of $\mathfrak{sl}(2; \mathbb{C})$ acting
on a finite-dimensional complex vector space $V$. Our strategy is to diagonalize the
operator $\pi(H)$. Since we are working over $\mathbb{C}$, the operator $\pi(H)$ must have at least
one eigenvector. Let $u$ be an eigenvector for $\pi(H)$ with eigenvalue $\alpha$. Applying
Lemma 4.33 repeatedly, we see that

\[ \pi(H)\pi(X)^k u = (\alpha + 2k)\pi(X)^k u. \]

Since an operator on a finite-dimensional space can have only finitely many eigenvalues,
the vectors $\pi(X)^k u$ cannot all be nonzero. Thus, there is some $N \geq 0$ such that

\[ \pi(X)^N u \neq 0 \]

but

\[ \pi(X)^{N+1} u = 0. \]

If we set $u_0 = \pi(X)^N u$ and $\lambda = \alpha + 2N$, then

\[ \pi(H)u_0 = \lambda u_0, \tag{4.12} \]
\[ \pi(X)u_0 = 0. \tag{4.13} \]

Let us then define

\[ u_k = \pi(Y)^k u_0 \]

for $k \geq 0$. By Lemma 4.33, we have

\[ \pi(H)u_k = (\lambda - 2k)u_k. \tag{4.14} \]

Now, it is easily verified by induction (Exercise 3) that

\[ \pi(X)u_k = k[\lambda - (k - 1)]u_{k-1} \quad (k \geq 1). \tag{4.15} \]

Furthermore, since $\pi(H)$ can have only finitely many eigenvalues, the $u_k$’s cannot
all be nonzero. There must, therefore, be a non-negative integer $m$ such that

\[ u_k = \pi(Y)^k u_0 \neq 0 \]

for all $k \leq m$, but

\[ u_{m+1} = \pi(Y)^{m+1} u_0 = 0. \]

If $u_{m+1} = 0$, then $\pi(X)u_{m+1} = 0$ and so, by (4.15),

\[ 0 = \pi(X)u_{m+1} = (m + 1)(\lambda - m)u_m. \]

Since $u_m$ and $m + 1$ are nonzero, we must have $\lambda - m = 0$. Thus, $\lambda$ must coincide
with the non-negative integer $m$.

Thus, for every irreducible representation $(\pi, V)$, there exists an integer $m \geq 0$
and nonzero vectors $u_0, \ldots, u_m$ such that

\[
\begin{aligned}
\pi(H)u_k &= (m - 2k)u_k, \\
\pi(Y)u_k &= \begin{cases} u_{k+1} & \text{if } k < m \\ 0 & \text{if } k = m \end{cases}, \\
\pi(X)u_k &= \begin{cases} k(m - (k - 1))u_{k-1} & \text{if } k > 0 \\ 0 & \text{if } k = 0 \end{cases}.
\end{aligned}
\tag{4.16}
\]

The vectors $u_0, \ldots, u_m$ must be linearly independent, since they are eigenvectors
of $\pi(H)$ with distinct eigenvalues (Proposition A.1). Moreover, the $(m + 1)$-dimensional
span of $u_0, \ldots, u_m$ is explicitly invariant under $\pi(H)$, $\pi(X)$, and $\pi(Y)$
and, hence, under $\pi(Z)$ for all $Z \in \mathfrak{sl}(2; \mathbb{C})$. Since $\pi$ is irreducible, this space must
be all of $V$.

We have shown that every irreducible representation of $\mathfrak{sl}(2; \mathbb{C})$ is of the
form (4.16). Conversely, if we define $\pi(H)$, $\pi(X)$, and $\pi(Y)$ by (4.16) (where
the $u_k$’s are basis elements for some $(m + 1)$-dimensional vector space), it is
not hard to check that operators defined as in (4.16) really do satisfy the $\mathfrak{sl}(2; \mathbb{C})$
commutation relations (Exercise 4). Furthermore, we may prove irreducibility
of this representation in the same way as in the proof of Proposition 4.11.

The preceding analysis shows that every irreducible representation of dimension
$m + 1$ must have the form in (4.16), which shows that any two such representations
are isomorphic. In particular, the $(m + 1)$-dimensional representation $\pi_m$ described
in Sect. 4.2 must be isomorphic to (4.16).

This completes the proof of Theorem 4.32. □
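The converse direction of the proof asserts that the operators in (4.16) satisfy the commutation relations. For any particular $m$ this can be confirmed by building the matrices in the basis $u_0, \ldots, u_m$, as in the following sketch. The function names are illustrative, and the convention that column $k$ holds the coefficients of the image of $u_k$ is an assumption of the sketch.

```python
# Illustrative sketch: matrices of pi(H), pi(X), pi(Y) from (4.16) in the basis u_0..u_m.

def sl2_matrices(m):
    """Return (H, X, Y); entry [i][k] is the coefficient of u_i in the image of u_k."""
    n = m + 1
    H = [[(m - 2 * k) if i == k else 0 for k in range(n)] for i in range(n)]
    # pi(Y) u_k = u_{k+1} for k < m, and pi(Y) u_m = 0.
    Y = [[1 if i == k + 1 else 0 for k in range(n)] for i in range(n)]
    # pi(X) u_k = k (m - (k - 1)) u_{k-1} for k > 0, and pi(X) u_0 = 0.
    X = [[k * (m - (k - 1)) if i == k - 1 else 0 for k in range(n)] for i in range(n)]
    return H, X, Y

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(P, Q):
    PQ, QP = matmul(P, Q), matmul(Q, P)
    n = len(P)
    return [[PQ[i][j] - QP[i][j] for j in range(n)] for i in range(n)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def satisfies_relations(m):
    """Check [H, X] = 2X, [H, Y] = -2Y, [X, Y] = H for the matrices of (4.16)."""
    H, X, Y = sl2_matrices(m)
    return (bracket(H, X) == scale(2, X)
            and bracket(H, Y) == scale(-2, Y)
            and bracket(X, Y) == H)

print(all(satisfies_relations(m) for m in range(6)))
```

For $m = 1$ the construction returns exactly the matrices $H$, $X$, $Y$ of the basis chosen at the start of this section.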


As mentioned earlier in this section, the representation theory of $\mathfrak{sl}(2; \mathbb{C})$ plays a
key role in the representation theory of other Lie algebras, such as $\mathfrak{sl}(3; \mathbb{C})$, because
these Lie algebras contain subalgebras isomorphic to $\mathfrak{sl}(2; \mathbb{C})$. For such applications,
we need a few results about finite-dimensional representations of $\mathfrak{sl}(2; \mathbb{C})$ that are
not necessarily irreducible.


Theorem 4.34. If $(\pi, V)$ is a finite-dimensional representation of $\mathfrak{sl}(2; \mathbb{C})$, not
necessarily irreducible, the following results hold.


1. Every eigenvalue of $\pi(H)$ is an integer. Furthermore, if $v$ is an eigenvector for
$\pi(H)$ with eigenvalue $\lambda$ and $\pi(X)v = 0$, then $\lambda$ is a non-negative integer.

2. The operators $\pi(X)$ and $\pi(Y)$ are nilpotent.

3. If we define $S : V \to V$ by

\[ S = e^{\pi(X)} e^{-\pi(Y)} e^{\pi(X)}, \]

then $S$ satisfies

\[ S\pi(H)S^{-1} = -\pi(H). \]

4. If an integer $k$ is an eigenvalue for $\pi(H)$, so is each of the numbers

\[ -|k|, -|k| + 2, \ldots, |k| - 2, |k|. \]


Since SU(2) is simply connected, Theorem 5.6 will tell us that the representations
of $\mathfrak{sl}(2; \mathbb{C}) \cong \mathfrak{su}(2)_{\mathbb{C}}$ are in one-to-one correspondence with the representations
of SU(2). Since SU(2) is compact, Theorem 4.28 then tells us that every representation
of $\mathfrak{sl}(2; \mathbb{C})$ is completely reducible. If we decompose $V$ as a direct sum
of irreducibles, it is easy to prove the theorem for each summand separately. It is,
however, preferable to give a proof of the theorem that does not rely on Theorem 5.6,
which in turn relies on the Baker-Campbell-Hausdorff formula.

See also Exercise 13 for a different approach to the first part of Point 1, and
Exercise 14 for a different approach to Point 3.


Proof. For Point 1, suppose $v$ is an eigenvector of $\pi(H)$ with eigenvalue $\lambda$. Then
there is some $N \geq 0$ such that $\pi(X)^N v \neq 0$ but $\pi(X)^{N+1} v = 0$, where $\pi(X)^N v$
is an eigenvector of $\pi(H)$ with eigenvalue $\lambda + 2N$. The proof of Theorem 4.32
shows that $m := \lambda + 2N$ must be a non-negative integer, so that $\lambda$ is an integer. If
$\pi(X)v = 0$, then we take $N = 0$ and $\lambda = m$ is non-negative.

For Point 2, it follows from the SN decomposition (Sect. A.3) that $\pi(H)$ has a
basis of generalized eigenvectors, that is, vectors $v$ for which $(\pi(H) - \lambda I)^k v = 0$
for some positive integer $k$. But, using the commutation relation $[H, X] = 2X$ and
induction on $k$, we can see that

\[ [\pi(H) - (\lambda + 2)I]^k \pi(X) = \pi(X)[\pi(H) - \lambda I]^k. \]

Thus, if $v$ is a generalized eigenvector for $\pi(H)$ with eigenvalue $\lambda$, then $\pi(X)v$
is either zero or a generalized eigenvector with eigenvalue $\lambda + 2$. Applying $\pi(X)$
repeatedly to a generalized eigenvector for $\pi(H)$ must eventually give zero, since
$\pi(H)$ can have only finitely many generalized eigenvalues. Thus, $\pi(X)$ is nilpotent.
A similar argument applies to $\pi(Y)$.

For Point 3, we note that

\[ S\pi(H)S^{-1} = e^{\pi(X)} e^{-\pi(Y)} e^{\pi(X)}\, \pi(H)\, e^{-\pi(X)} e^{\pi(Y)} e^{-\pi(X)}. \tag{4.17} \]

By Proposition 3.35, we have

\[ e^{\pi(X)} \pi(H) e^{-\pi(X)} = \mathrm{Ad}_{e^{\pi(X)}}(\pi(H)) = e^{\mathrm{ad}_{\pi(X)}}(\pi(H)) \]

and similarly for the remaining products in (4.17).
Now, $\mathrm{ad}_X(X) = 0$, $\mathrm{ad}_X(H) = -2X$, and $\mathrm{ad}_X(Y) = H$, and similarly with $\pi$
applied to each Lie algebra element. Thus,

\[ e^{\mathrm{ad}_{\pi(X)}}(\pi(H)) = \pi(H) - 2\pi(X). \]

Meanwhile, $\mathrm{ad}_Y(X) = -H$, $\mathrm{ad}_Y(H) = 2Y$, and $\mathrm{ad}_Y(Y) = 0$. Thus,

\[
\begin{aligned}
e^{-\mathrm{ad}_{\pi(Y)}}(\pi(H) - 2\pi(X))
&= \pi(H) - 2\pi(Y) - 2\pi(X) - 2\pi(H) + 2\pi(Y) \\
&= -\pi(H) - 2\pi(X).
\end{aligned}
\]

Finally,

\[
\begin{aligned}
e^{\mathrm{ad}_{\pi(X)}}(-\pi(H) - 2\pi(X)) &= -\pi(H) - 2\pi(X) + 2\pi(X) \\
&= -\pi(H),
\end{aligned}
\]

which establishes Point 3.

For Point 4, assume first that $k$ is non-negative and let $v$ be an eigenvector for
$\pi(H)$ with eigenvalue $k$. As in Point 1, there is then another eigenvector $v_0$
for $\pi(H)$ with eigenvalue $m := k + 2N \geq k$ and such that $\pi(X)v_0 = 0$. Then
by the proof of Theorem 4.32, we obtain a chain of eigenvectors $v_0, v_1, \ldots, v_m$ for
$\pi(H)$ with eigenvalues ranging from $m$ to $-m$ in increments of 2. These eigenvalues
include all of the numbers $k, k - 2, \ldots, -k$. If $k$ is negative and $v$ is an eigenvector
for $\pi(H)$ with eigenvalue $k$, then $Sv$ is an eigenvector for $\pi(H)$ with eigenvalue
$|k|$. Hence, by the preceding argument, each of the numbers from $|k|$ to $-|k|$ in
increments of 2 is an eigenvalue. □
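Point 3 can be checked exactly in any of the representations (4.16), because $\pi(X)$ and $\pi(Y)$ are nilpotent there and their exponentials are finite sums. The sketch below assumes the matrices of (4.16) with $m = 3$ and uses rational arithmetic; it verifies $S\pi(H) = -\pi(H)S$, which is equivalent to the stated identity since $S$ is invertible. All function names are illustrative.

```python
# Illustrative sketch: check S pi(H) = -pi(H) S for S = e^{pi(X)} e^{-pi(Y)} e^{pi(X)}.
from fractions import Fraction

def sl2_matrices(m):
    """Matrices of pi(H), pi(X), pi(Y) from (4.16); column k is the image of u_k."""
    n = m + 1
    H = [[Fraction(m - 2 * k) if i == k else Fraction(0) for k in range(n)]
         for i in range(n)]
    Y = [[Fraction(1) if i == k + 1 else Fraction(0) for k in range(n)]
         for i in range(n)]
    X = [[Fraction(k * (m - (k - 1))) if i == k - 1 else Fraction(0) for k in range(n)]
         for i in range(n)]
    return H, X, Y

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def identity(n):
    return [[Fraction(1) if i == j else Fraction(0) for j in range(n)] for i in range(n)]

def exp_nilpotent(N):
    """Exponential series of a nilpotent n x n matrix; terms vanish after degree n - 1."""
    n = len(N)
    total, term = identity(n), identity(n)
    for k in range(1, n + 1):
        term = scale(Fraction(1, k), matmul(term, N))   # term = N^k / k!
        total = add(total, term)
    return total

H, X, Y = sl2_matrices(3)
S = matmul(matmul(exp_nilpotent(X), exp_nilpotent(scale(-1, Y))), exp_nilpotent(X))
point3_holds = (matmul(S, H) == scale(-1, matmul(H, S)))
print(point3_holds)
```

In this basis $S$ reverses the order of the eigenvectors $u_0, \ldots, u_m$ up to scalar factors, which is how it exchanges the eigenvalues $k$ and $-k$ in Point 4.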


4.7 Group Versus Lie Algebra Representations 


We know from Chapter 3 (Theorem 3.28) that every Lie group homomorphism gives 
rise to a Lie algebra homomorphism. In particular, every representation of a matrix 
Lie group gives rise to a representation of the associated Lie algebra. In Chapter 5, 
we will prove a partial converse to this result: If G is a simply connected matrix 
Lie group with Lie algebra g, every representation of g comes from a representation 
of G. (See Theorem 5.6). Thus, for a simply connected matrix Lie group G, there 
is a natural one-to-one correspondence between the representations of G and the 
representations of g. 

It is instructive to see how this general theory works out in the case of SU(2)
(which is simply connected) and SO(3) (which is not). For every irreducible
representation $\pi$ of $\mathfrak{su}(2)$, the complex-linear extension of $\pi$ to $\mathfrak{sl}(2; \mathbb{C})$ must
be isomorphic (Theorem 4.32) to one of the representations $\pi_m$ described in
Sect. 4.2. Since those representations are constructed from representations of the
group SU(2), we can see directly (without appealing to Theorem 5.6) that every
irreducible representation of $\mathfrak{su}(2)$ comes from a representation of SU(2).

Now, by Example 3.27, there is a Lie algebra isomorphism $\phi : \mathfrak{su}(2) \to \mathfrak{so}(3)$
such that $\phi(E_j) = F_j$, $j = 1, 2, 3$, where $\{E_1, E_2, E_3\}$ and $\{F_1, F_2, F_3\}$ are
the bases listed in the example. Thus, the irreducible representations of $\mathfrak{so}(3)$ are
precisely of the form $\sigma_m = \pi_m \circ \phi^{-1}$. We wish to determine, for a particular
$m$, whether or not there is a representation $\Sigma_m$ of the group SO(3) such that
$\Sigma_m(\exp X) = \exp(\sigma_m(X))$ for all $X$ in $\mathfrak{so}(3)$.


Proposition 4.35. Let $\sigma_m = \pi_m \circ \phi^{-1}$ be an irreducible complex representation
of the Lie algebra $\mathfrak{so}(3)$ ($m \geq 0$). If $m$ is even, there is a representation $\Sigma_m$ of the
group SO(3) such that $\Sigma_m(e^X) = e^{\sigma_m(X)}$ for all $X$ in $\mathfrak{so}(3)$. If $m$ is odd, there is no
such representation of SO(3).


Note that the condition that $m$ be even is equivalent to the condition that
$\dim V_m = m + 1$ be odd. Thus, it is the odd-dimensional representations of the
Lie algebra $\mathfrak{so}(3)$ which come from group representations. In the physics literature,
the representations of $\mathfrak{su}(2) \cong \mathfrak{so}(3)$ are labeled by the parameter $l = m/2$. In
terms of this notation, a representation of $\mathfrak{so}(3)$ comes from a representation of
SO(3) if and only if $l$ is an integer. The representations with $l$ an integer are called
“integer spin”; the others are called “half-integer spin.”

For any $m$, one could attempt to construct $\Sigma_m$ by the construction in the proof
of Theorem 5.6. The construction is based on defining $\Sigma_m(A)$ along a path joining
$I$ to $A$ and then proving that the value of $\Sigma_m(A)$ is independent of the choice of path.
The construction of $\Sigma_m$ along a path goes through without change. Since SO(3) is
not simply connected, however, two paths in SO(3) with the same endpoint are
not necessarily homotopic with endpoints fixed and the proof of independence of
the path breaks down. One can join the identity to itself, for example, either by
the constant path or by the path consisting of rotations by angle $2\pi t$ in the
$(y, z)$-plane, $0 \leq t \leq 1$. If one defines $\Sigma_m$ along the constant path, one gets the value
$\Sigma_m(I) = I$. If $m$ is odd, however, and one defines $\Sigma_m$ along the path of rotations
in the $(y, z)$-plane, then one gets the value $\Sigma_m(I) = -I$, as the calculations in the
proof of Proposition 4.35 will show. This strongly suggests (and Proposition 4.35
confirms) that when $m$ is odd, there is no way to define $\Sigma_m$ as a “single-valued”
representation of SO(3).

An electron, for example, is a “spin one-half” particle, which means that it
is described in quantum mechanics in a way that involves the two-dimensional
representation $\sigma_1$ of $\mathfrak{so}(3)$. In the physics literature, one finds statements to the effect
that applying a 360° rotation to the wave function of the electron gives back the
negative of the original wave function. This statement reflects that if one attempts to
construct the nonexistent representation $\Sigma_1$ of SO(3), then when defining $\Sigma_1$ along
a path of rotations in the $(x, y)$-plane, one gets that $\Sigma_1(I) = -I$.


Proof. Suppose, first, that $m$ is odd and suppose that such a $\Sigma_m$ existed. Computing
as in Sect. 2.2, we see that

\[
e^{2\pi F_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos 2\pi & -\sin 2\pi \\ 0 & \sin 2\pi & \cos 2\pi \end{pmatrix} = I.
\]

Meanwhile, $\sigma_m(F_1) = \pi_m(\phi^{-1}(F_1)) = \pi_m(E_1)$, with $E_1 = iH/2$, where, as usual,
$H$ is the diagonal matrix with diagonal entries $(1, -1)$. We know that there is a
basis $u_0, u_1, \ldots, u_m$ for $V_m$ such that $u_j$ is an eigenvector for $\pi_m(H)$ with eigenvalue
$m - 2j$. This means that $u_j$ is also an eigenvector for $\sigma_m(F_1) = i\pi_m(H)/2$, with
eigenvalue $i(m - 2j)/2$. Thus, in the basis $\{u_j\}$, we have

\[
\sigma_m(F_1) = \frac{1}{2}
\begin{pmatrix}
im & & & \\
 & i(m - 2) & & \\
 & & \ddots & \\
 & & & i(-m)
\end{pmatrix}.
\]

Since we are assuming $m$ is odd, $m - 2j$ is an odd integer. Thus, $e^{2\pi \sigma_m(F_1)}$ has
eigenvalues $e^{2\pi i(m - 2j)/2} = -1$ in the basis $\{u_j\}$, showing that $e^{2\pi \sigma_m(F_1)} = -I$.
Thus, on the one hand,

\[ \Sigma_m(e^{2\pi F_1}) = \Sigma_m(I) = I, \]

whereas, on the other hand,

\[ \Sigma_m(e^{2\pi F_1}) = e^{2\pi \sigma_m(F_1)} = -I. \]

Thus, there can be no such group representation $\Sigma_m$.

Suppose now that $m$ is even. Recall that the Lie algebra isomorphism $\phi$ comes
from the surjective group homomorphism $\Phi$ in Proposition 1.19, where $\ker(\Phi) =
\{I, -I\}$. Let $\Pi_m$ be the representation of SU(2) in Example 4.10. Now, $e^{2\pi E_1} = -I$,
and, thus,

\[
\Pi_m(-I) = \Pi_m(e^{2\pi E_1}) = e^{\pi_m(2\pi E_1)}.
\]

If, however, $m$ is even, then $e^{\pi_m(2\pi E_1)}$ is diagonal in the basis $\{u_j\}$ with eigenvalues
$e^{2\pi i(m-2j)/2} = 1$, showing that $\Pi_m(-I) = e^{\pi_m(2\pi E_1)} = I$.
Now, for each $R \in \mathrm{SO}(3)$, there is a unique pair of elements $\{U, -U\}$ such that
$\Phi(U) = \Phi(-U) = R$. Since $\Pi_m(-I) = I$, we see that $\Pi_m(U) = \Pi_m(-U)$. It
thus makes sense to define

\[
\Sigma_m(R) = \Pi_m(U).
\]

It is easy to see that $\Sigma_m$ is a Lie group homomorphism, and, by construction, we
have $\Pi_m = \Sigma_m \circ \Phi$. Thus, the Lie algebra representation $\sigma_m$ associated to $\Sigma_m$
satisfies $\pi_m = \sigma_m \circ \phi$, or $\sigma_m = \pi_m \circ \phi^{-1}$, showing that $\Sigma_m$ is the desired representation
of SO(3). □
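
The parity argument in the proof can be checked numerically. The following Python sketch is only a sanity check, not part of the proof: it evaluates the eigenvalues $e^{2\pi i(m-2j)/2}$, $j = 0, \ldots, m$, and confirms that they all equal $-1$ when $m$ is odd and $+1$ when $m$ is even. The function names are our own and are chosen purely for illustration.

```python
import cmath

def full_rotation_eigenvalues(m):
    """Eigenvalues exp(2*pi*i*(m - 2j)/2), j = 0..m, of exp(2*pi*sigma_m(F_1))."""
    return [cmath.exp(2 * cmath.pi * 1j * (m - 2 * j) / 2) for j in range(m + 1)]

def is_scalar(values, target, tol=1e-12):
    """True if every eigenvalue is within tol of the given target."""
    return all(abs(v - target) < tol for v in values)

for m in range(1, 7):
    expected = -1 if m % 2 else 1   # -I for odd m, +I for even m
    print(m, is_scalar(full_rotation_eigenvalues(m), expected))
```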


4.8 A Nonmatrix Lie Group 


In this section, we will show that the Lie group introduced in Sect. 1.5 is not
isomorphic to a matrix Lie group. (The universal cover of SL(2;R) is another
example of a Lie group that is not a matrix Lie group; see Sect. 5.8.) The group
in question is $G = \mathbb{R} \times \mathbb{R} \times S^1$, with the group product defined by

\[
(x_1, y_1, u_1) \cdot (x_2, y_2, u_2) = (x_1 + x_2,\; y_1 + y_2,\; e^{i x_1 y_2} u_1 u_2).
\]


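Meanwhile, let $H$ be the Heisenberg group and consider the map $\Phi : H \to G$
given by

\[
\Phi \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} = (a, c, e^{ib}).
\]

Direct computation shows that $\Phi$ is a homomorphism. The kernel of $\Phi$ is the
discrete normal subgroup $N$ of $H$ given by

\[
N = \left\{ \begin{pmatrix} 1 & 0 & 2\pi n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; n \in \mathbb{Z} \right\}.
\]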
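
The “direct computation” can be spot-checked numerically. In the sketch below, a Heisenberg matrix with entries $a$, $b$, $c$ is stored as the tuple `(a, b, c)` and an element of $G$ as `(x, y, u)`; this encoding and the function names are our own assumptions, made only for illustration.

```python
import cmath

def heis_mul(p, q):
    """Product of Heisenberg matrices [[1,a,b],[0,1,c],[0,0,1]], stored as (a, b, c)."""
    a1, b1, c1 = p
    a2, b2, c2 = q
    return (a1 + a2, b1 + b2 + a1 * c2, c1 + c2)

def g_mul(g1, g2):
    """Product in G = R x R x S^1: (x1 + x2, y1 + y2, e^{i x1 y2} u1 u2)."""
    x1, y1, u1 = g1
    x2, y2, u2 = g2
    return (x1 + x2, y1 + y2, cmath.exp(1j * x1 * y2) * u1 * u2)

def phi(p):
    """The map Phi : H -> G sending (a, b, c) to (a, c, e^{ib})."""
    a, b, c = p
    return (a, c, cmath.exp(1j * b))

def close(g1, g2, tol=1e-12):
    """Componentwise comparison up to rounding error."""
    return all(abs(s - t) < tol for s, t in zip(g1, g2))

A = (0.7, -1.3, 2.1)
B = (-0.4, 0.9, 1.5)
print(close(phi(heis_mul(A, B)), g_mul(phi(A), phi(B))))   # homomorphism property

kernel_element = (0.0, 2 * cmath.pi * 3, 0.0)              # a = c = 0, b = 2*pi*n with n = 3
print(close(phi(kernel_element), (0.0, 0.0, 1.0)))         # maps to the identity of G
```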


Now, suppose that $\Sigma$ is any finite-dimensional representation of $G$. Then we can
define an associated representation $\Pi$ of $H$ by $\Pi = \Sigma \circ \Phi$. Clearly, the kernel
of any such representation of $H$ must include the kernel $N$ of $\Phi$. Now, let $Z(H)$
denote the center of $H$, which is easily shown to be

\[
Z(H) = \left\{ \begin{pmatrix} 1 & 0 & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; b \in \mathbb{R} \right\}.
\]

Theorem 4.36. Let $\Pi$ be any finite-dimensional representation of $H$. If $\ker \Pi \supset N$,
then $\ker \Pi \supset Z(H)$.


Once this is established, we will be able to conclude that there are no faithful
finite-dimensional representations of $G$. After all, if $\Sigma$ is any finite-dimensional
representation of $G$, then the kernel of $\Pi = \Sigma \circ \Phi$ will contain $N$ and, thus, $Z(H)$,
by the theorem. Thus, for all $b \in \mathbb{R}$,

\[
\Pi \begin{pmatrix} 1 & 0 & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \Sigma(0, 0, e^{ib}) = I.
\]

This means that the kernel of $\Sigma$ contains all elements of the form $(0, 0, u)$ and $\Sigma$ is
not faithful. Thus, we obtain the following result.


Corollary 4.37. The Lie group G has no faithful finite-dimensional representa- 
tions. In particular, G is not isomorphic to any matrix Lie group. 


We now begin the proof of Theorem 4.36. 


Lemma 4.38. If $X$ is a nilpotent matrix and $e^{tX} = I$ for some nonzero $t$, then
$X = 0$.


Proof. Since $X$ is nilpotent, the power series for $e^{tX}$ terminates after a finite number
of terms. Thus, each entry of $e^{tX}$ depends polynomially on $t$; that is, there exist
polynomials $p_{jk}(t)$ such that $(e^{tX})_{jk} = p_{jk}(t)$. If $e^{tX} = I$ for some nonzero $t$, then
$e^{ntX} = I$ for all $n \in \mathbb{Z}$, showing that $p_{jk}(nt) = \delta_{jk}$ for all $n$. However, a polynomial
that takes on a certain value infinitely many times must be constant. Thus, actually,
$e^{tX} = I$ for all $t$, from which it follows that $X = 0$. □
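Proof of Theorem 4.36. Let $\pi$ be the associated representation of the Lie algebra $\mathfrak{h}$
of $H$. Let $\{X, Y, Z\}$ be the following basis for $\mathfrak{h}$:

\[
X = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.18}
\]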


These satisfy the commutation relations [X, Y] = Z and [X, Z] = [Y, Z] = 0. 

We now claim that $\pi(Z)$ must be nilpotent, or, equivalently, that all of the
eigenvalues of $\pi(Z)$ are zero. Let $\lambda$ be an eigenvalue for $\pi(Z)$ and let $V_\lambda$ be the
associated eigenspace. Then $V_\lambda$ is certainly invariant under $\pi(Z)$. Furthermore,
since $\pi(X)$ and $\pi(Y)$ commute with $\pi(Z)$, Proposition A.2 tells us that $V_\lambda$ is
invariant under $\pi(X)$ and $\pi(Y)$. Thus, the restriction of $\pi(Z)$ to $V_\lambda$, namely $\lambda I$,
is the commutator of the restrictions to $V_\lambda$ of $\pi(X)$ and $\pi(Y)$. Since the trace of a
commutator is zero, we have $0 = \lambda \dim(V_\lambda)$ and $\lambda$ must be zero.

Now, direct calculation shows that $e^{2\pi n Z}$ belongs to $N$ for all integers $n$. Thus, if
$\Pi$ is a representation of $H$ and $\ker \Pi \supset N$, we have $\Pi(e^{2\pi n Z}) = I$ for all $n$. Since
$\pi(Z)$ is nilpotent, Lemma 4.38 tells us that $\pi(Z)$ is zero and thus that $\Pi(e^{tZ}) =
e^{t\pi(Z)} = I$ for all $t \in \mathbb{R}$. Since every element of $Z(H)$ is of the form $e^{tZ}$ for some
$t$, we have the desired conclusion. □


4.9 Exercises 


1. Prove Point 2 of Proposition 4.5. 

2. Show that the adjoint representation and the standard representation are
isomorphic representations of the Lie algebra $\mathfrak{so}(3)$. Show that the adjoint and
standard representations of the group SO(3) are isomorphic.

3. Using the commutation relation $[\pi(X), \pi(Y)] = \pi(H)$ and induction on $k$,
verify the relation (4.15).

4. Define a vector space with basis $u_0, u_1, \ldots, u_m$. Now, define operators $\pi(H)$,
$\pi(X)$, and $\pi(Y)$ by formula (4.16). Verify by direct computation that the operators
defined by (4.16) satisfy the commutation relations (4.11) for $\mathfrak{sl}(2; \mathbb{C})$.
Hint: When dealing with $\pi(Y)$, treat the case of $u_k$, $k < m$, separately from the
case of $u_m$, and similarly for $\pi(X)$.

5. Consider the standard representation of the Heisenberg group, acting on $\mathbb{C}^3$.
Determine all subspaces of $\mathbb{C}^3$ which are invariant under the action of the
Heisenberg group. Is this representation completely reducible?

6. Prove Proposition 4.22. 

Hint: There is a one-to-one correspondence between subspaces of $V$ and
subspaces of $V^*$ as follows: For any subspace $W$ of $V$, the annihilator of $W$ is
the subspace of all $\phi$ in $V^*$ such that $\phi$ is zero on $W$. See Sect. A.7.

7. Prove Theorem 4.14. 
Hints: For existence, choose bases $\{e_j\}$ and $\{f_k\}$ for $U$ and $V$. Then
define a space $W$ which has as a basis $\{w_{jk} \mid 1 \le j \le n,\ 1 \le k \le m\}$. Define
$\phi(e_j, f_k) = w_{jk}$ and extend by bilinearity. For uniqueness, use the universal
property.


8. Let SO(2) act on $\mathbb{R}^2$ in the obvious way. Show that $\mathbb{R}^2$ is an irreducible
real representation under this action, but that Point 2 of Schur’s lemma
(Theorem 4.29) fails.

9. Suppose $V$ is a finite-dimensional representation of a group or Lie algebra and
that $W$ is a nonzero invariant subspace of $V$. Show that there exists a nonzero
irreducible invariant subspace for $V$ that is contained in $W$.

10. Suppose that $V_1$ and $V_2$ are nonisomorphic irreducible representations of a
group or Lie algebra, and consider the associated representation $V_1 \oplus V_2$.
Regard $V_1$ and $V_2$ as subspaces of $V_1 \oplus V_2$ in the obvious way. Following the
outline below, show that $V_1$ and $V_2$ are the only nontrivial invariant subspaces
of $V_1 \oplus V_2$.

(a) First assume that $U$ is a nontrivial irreducible invariant subspace. Let
$P_1 : V_1 \oplus V_2 \to V_1$ be the projection onto the first factor and let $P_2$ be the
projection onto the second factor. Show that $P_1$ and $P_2$ are intertwining
maps. Show that $U = V_1$ or $U = V_2$.

(b) Using Exercise 9, show that $V_1$ and $V_2$ are the only nontrivial invariant
subspaces of $V_1 \oplus V_2$.


11. Suppose that $V$ is an irreducible finite-dimensional representation of a group or
Lie algebra over $\mathbb{C}$, and consider the associated representation $V \oplus V$. Show
that every nontrivial invariant subspace $U$ of $V \oplus V$ is isomorphic to $V$ and is
of the form

\[
U = \{(\lambda_1 v, \lambda_2 v) \mid v \in V\}
\]

for some constants $\lambda_1$ and $\lambda_2$, not both zero.

12. Recall the spaces $V_m$ introduced in Sect. 4.2, viewed as representations of
the Lie algebra $\mathfrak{sl}(2; \mathbb{C})$. In particular, consider the space $V_1$ (which has
dimension 2).

(a) Regard $V_1 \otimes V_1$ as a representation of $\mathfrak{sl}(2; \mathbb{C})$, as in Definition 4.20. Show
that this representation is not irreducible.

(b) Now, view $V_1 \otimes V_1$ as a representation of $\mathfrak{sl}(2; \mathbb{C}) \oplus \mathfrak{sl}(2; \mathbb{C})$, as in
Definition 4.19. Show that this representation is irreducible.


13. Let $(\Pi, V)$ be a finite-dimensional representation of SU(2) with associated
representation $\pi$ of $\mathfrak{su}(2)$, which extends by complex linearity to $\mathfrak{sl}(2; \mathbb{C})$. (Since
SU(2) is simply connected, Theorem 5.6 will show that every representation
$\pi$ of $\mathfrak{sl}(2; \mathbb{C})$ arises in this way.) If $H$ is the diagonal matrix with diagonal
entries $(1, -1)$, show that $e^{2\pi i H} = I$ and use this to prove (independently of
Theorem 4.34) that every eigenvalue of $\pi(H)$ is an integer.
14. Let $(\Pi, V)$ be a finite-dimensional representation of SU(2) with associated
representation $\pi$ of $\mathfrak{su}(2)$, which extends by complex linearity to $\mathfrak{sl}(2; \mathbb{C})$. If
$X$, $Y$, and $H$ are the usual basis elements for $\mathfrak{sl}(2; \mathbb{C})$, compute $e^X e^{-Y} e^X$ and
show that

\[
e^X e^{-Y} e^X \, H \, (e^X e^{-Y} e^X)^{-1} = -H.
\]

Use this result to give a different proof of Point 3 of Theorem 4.34.


Chapter 5 
The Baker-Campbell-Hausdorff Formula
and Its Consequences


5.1 The “Hard” Questions 


Consider three elementary results from the preceding chapters of this book: (1)
Every matrix Lie group $G$ has a Lie algebra $\mathfrak{g}$. (2) A continuous homomorphism
$\Phi$ between matrix Lie groups $G$ and $H$ gives rise to a Lie algebra homomorphism
$\phi : \mathfrak{g} \to \mathfrak{h}$. (3) If $G$ and $H$ are matrix Lie groups and $H$ is a subgroup of $G$,
then the Lie algebra $\mathfrak{h}$ of $H$ is a subalgebra of the Lie algebra $\mathfrak{g}$ of $G$. Each of these
results goes in the “easy” direction, from a group notion to an associated Lie algebra
notion. In this chapter, we attempt to go in the “hard” direction, from the Lie algebra
to the Lie group. We will investigate three questions relating to the preceding three
theorems.


• Question 1: Is every finite-dimensional, real Lie algebra the Lie algebra of some
matrix Lie group?

• Question 2: Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$,
respectively, and let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra homomorphism. Does there
exist a Lie group homomorphism $\Phi : G \to H$ such that $\Phi(e^X) = e^{\phi(X)}$ for all
$X \in \mathfrak{g}$?

• Question 3: If $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$ and $\mathfrak{h}$ is a subalgebra
of $\mathfrak{g}$, is there a matrix Lie group $H \subset G$ whose Lie algebra is $\mathfrak{h}$?


The answer to Question 1 is yes; see Sect. 5.10. The answer to Question 2 is, 
in general, no, but is yes if G is simply connected; see Sect. 5.7. The answer to 
Question 3 is no, in general, but is yes if we allow H to be a “connected Lie 
subgroup” that is not necessarily closed; see Sect. 5.9. 

Our tool for investigating these questions is the Baker-Campbell-Hausdorff
formula, which expresses $\log(e^X e^Y)$, where $X$ and $Y$ are sufficiently small $n \times n$
matrices, in Lie-algebraic terms, that is, in terms of iterated commutators involving
$X$ and $Y$. The formula implies that all information about the product operation on
a matrix Lie group, at least near the identity, is encoded in the Lie algebra. In the
case of Questions 2 and 3 in the preceding paragraph, we will give a complete proof
of the theorem that answers the question. In the case of Question 1, we will need to
assume Ado’s theorem, which asserts that every finite-dimensional real Lie algebra
is isomorphic to an algebra of matrices.


5.2 An Illustrative Example 


In this section, we prove one of the main theorems alluded to above (the answer
to Question 2), in the case of the Heisenberg group. We introduce the problem in
a general setting and then specialize to the Heisenberg case. Suppose $G$ and $H$
are matrix Lie groups with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$, respectively, and suppose
$\phi : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra homomorphism. We would like to construct a Lie group
homomorphism $\Phi : G \to H$ such that $\Phi(e^X) = e^{\phi(X)}$ for all $X$ in $\mathfrak{g}$. In light of
Theorem 3.42, we can define a map $\Phi$ from a neighborhood $U$ of the identity in $G$
into $H$ as follows:

\[
\Phi(A) = e^{\phi(\log A)}, \tag{5.1}
\]

so that

\[
\Phi(e^X) = e^{\phi(X)} \tag{5.2}
\]

at least for sufficiently small $X$.

A key issue is then to show that $\Phi$, as defined near the identity by (5.1) or (5.2), is
a “local homomorphism.” Suppose $A = e^X$ and $B = e^Y$, where $X, Y \in \mathfrak{g}$ are small
enough that $e^X$, $e^Y$, and $e^X e^Y$ are all in the domain of $\Phi$. To compute $\Phi(AB) = \Phi(e^X e^Y)$,
we need to express $e^X e^Y$ in the form $e^Z$, so that $\Phi(e^X e^Y)$ will equal $e^{\phi(Z)}$. The
Baker-Campbell-Hausdorff formula states that for sufficiently small $X$ and $Y$, we
have

\[
\begin{aligned}
Z &= \log(e^X e^Y) \\
  &= X + Y + \frac{1}{2}[X, Y] + \frac{1}{12}[X, [X, Y]] - \frac{1}{12}[Y, [X, Y]] + \cdots,
\end{aligned} \tag{5.3}
\]

where the “$\cdots$” refers to additional terms involving iterated brackets of $X$ and $Y$. (A
precise statement of and a proof of the formula will be given in subsequent sections.)
If $\phi$ is a Lie algebra homomorphism, then

\[
\begin{aligned}
\phi(\log(e^X e^Y)) &= \phi(X) + \phi(Y) + \frac{1}{2}[\phi(X), \phi(Y)] \\
&\quad + \frac{1}{12}[\phi(X), [\phi(X), \phi(Y)]] - \frac{1}{12}[\phi(Y), [\phi(X), \phi(Y)]] + \cdots \\
&= \log(e^{\phi(X)} e^{\phi(Y)}).
\end{aligned} \tag{5.4}
\]

It then follows that $\Phi(e^X e^Y) = e^{\phi(X)} e^{\phi(Y)} = \Phi(e^X)\Phi(e^Y)$.
In the general case, it requires considerable effort to prove the Baker-Campbell-Hausdorff
formula and then to prove that, when $G$ is simply connected, $\Phi$ can
be extended to all of $G$. In the case of the Heisenberg group (which is simply
connected), the argument can be greatly simplified.


Theorem 5.1. Suppose $X$ and $Y$ are $n \times n$ complex matrices, and that $X$ and $Y$
commute with their commutator:

\[
[X, [X, Y]] = [Y, [X, Y]] = 0. \tag{5.5}
\]

Then we have

\[
e^X e^Y = e^{X + Y + \frac{1}{2}[X, Y]}.
\]

This is the special case of (5.3) in which the series terminates after the $[X, Y]$
term. See Exercise 5 for another special case of the Baker-Campbell-Hausdorff
formula.
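
Theorem 5.1 is easy to test numerically on $3 \times 3$ strictly upper triangular matrices, which satisfy (5.5). The following Python sketch is an illustration only: the helper functions and the particular matrices are our own choices, and the matrix exponential is computed from a truncated power series (which is exact here because the matrices are nilpotent).

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return mat_add(mat_mul(A, B), mat_mul(B, A), scale=-1.0)

def max_diff(A, B):
    """Largest absolute entrywise difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 1.5, 0.3], [0.0, 0.0, -2.0], [0.0, 0.0, 0.0]]
Y = [[0.0, -0.7, 1.1], [0.0, 0.0, 0.4], [0.0, 0.0, 0.0]]
C = bracket(X, Y)                      # only the upper right entry is nonzero

lhs = mat_mul(mat_exp(X), mat_exp(Y))
rhs = mat_exp(mat_add(mat_add(X, Y), C, scale=0.5))
print(max_diff(lhs, rhs))              # agreement up to rounding error
```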


Proof. Consider $X$ and $Y$ in $M_n(\mathbb{C})$ satisfying (5.5). We will prove that

\[
e^{tX} e^{tY} = \exp\left(tX + tY + \frac{t^2}{2}[X, Y]\right),
\]

which reduces to the desired result in the case $t = 1$. Since, by assumption, $[X, Y]$
commutes with $X$ and $Y$, the above relation is equivalent to

\[
e^{tX} e^{tY} e^{-\frac{t^2}{2}[X, Y]} = e^{t(X+Y)}. \tag{5.6}
\]

Let us denote by $A(t)$ the left-hand side of (5.6) and by $B(t)$ the right-hand side. Our
strategy will be to show that $A(t)$ and $B(t)$ satisfy the same differential equation,
with the same initial conditions. We can see immediately that

\[
\frac{dB}{dt} = B(t)(X + Y).
\]


On the other hand, differentiating $A(t)$ by means of the product rule gives

\[
\begin{aligned}
\frac{dA}{dt} &= e^{tX} X e^{tY} e^{-\frac{t^2}{2}[X, Y]} + e^{tX} e^{tY} Y e^{-\frac{t^2}{2}[X, Y]} \\
&\quad + e^{tX} e^{tY} e^{-\frac{t^2}{2}[X, Y]} (-t[X, Y]).
\end{aligned} \tag{5.7}
\]

(The correctness of the last term may be verified by differentiating $e^{-t^2[X,Y]/2}$ term
by term.) Now, since $Y$ commutes with $[X, Y]$, it also commutes with $e^{-t^2[X,Y]/2}$.
Thus, the second term on the right in (5.7) can be rewritten as

\[
e^{tX} e^{tY} e^{-\frac{t^2}{2}[X, Y]} \, Y.
\]

For the first term on the right-hand side of (5.7), we compute, using Proposition 3.35,
that

\[
\begin{aligned}
X e^{tY} &= e^{tY} e^{-tY} X e^{tY} \\
&= e^{tY} \mathrm{Ad}_{e^{-tY}}(X) \\
&= e^{tY} e^{-t\,\mathrm{ad}_Y}(X).
\end{aligned}
\]

Since $[Y, [Y, X]] = -[Y, [X, Y]] = 0$, we have

\[
e^{-t\,\mathrm{ad}_Y}(X) = X - t[Y, X] = X + t[X, Y],
\]

with all higher terms being zero. We may then simplify (5.7) to

\[
\frac{dA}{dt} = e^{tX} e^{tY} e^{-\frac{t^2}{2}[X, Y]} (X + Y) = A(t)(X + Y).
\]

We see, then, that $A(t)$ and $B(t)$ satisfy the same differential equation, with the
same initial condition $A(0) = B(0) = I$. Thus, by standard uniqueness results for
(linear) ordinary differential equations, $A(t) = B(t)$ for all $t$. □


Theorem 5.2. Let $H$ denote the Heisenberg group and $\mathfrak{h}$ its Lie algebra. Let $G$
be a matrix Lie group with Lie algebra $\mathfrak{g}$ and let $\phi : \mathfrak{h} \to \mathfrak{g}$ be a Lie algebra
homomorphism. Then there exists a unique Lie group homomorphism $\Phi : H \to G$
such that

\[
\Phi(e^X) = e^{\phi(X)}
\]

for all $X \in \mathfrak{h}$.


Proof. Recall (Exercise 18 in Chapter 3) that the Heisenberg group has the special
property that its exponential map is one-to-one and onto. Let “log” denote the
inverse of this map and define $\Phi : H \to G$ by the formula

\[
\Phi(A) = e^{\phi(\log A)}.
\]

We will show that $\Phi$ is a group homomorphism.

If $X$ and $Y$ are in the Lie algebra of the Heisenberg group ($3 \times 3$ strictly upper
triangular matrices), direct computation shows that every entry of $[X, Y]$ is zero
except possibly for the entry in the upper right corner. It is then easily seen that
$[X, Y]$ commutes with both $X$ and $Y$. Since $\phi$ is a Lie algebra homomorphism,
$\phi(X)$ and $\phi(Y)$ will also commute with their commutator. Thus, by Theorem 5.1,
for any $X$ and $Y$ in the Lie algebra of the Heisenberg group, we have

\[
\begin{aligned}
\Phi(e^X e^Y) &= \Phi\left(e^{X + Y + \frac{1}{2}[X, Y]}\right) \\
&= e^{\phi(X) + \phi(Y) + \frac{1}{2}[\phi(X), \phi(Y)]} \\
&= e^{\phi(X)} e^{\phi(Y)} \\
&= \Phi(e^X)\Phi(e^Y).
\end{aligned}
\]

Thus, $\Phi$ is a group homomorphism, which is continuous because each of exp, log,
and $\phi$ is continuous. □
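
The construction in this proof can be made completely explicit. In the sketch below, a Heisenberg matrix with entries $a$, $b$, $c$ is stored as `(a, b, c)` and the Lie algebra element $xX + yY + zZ$ as `(x, y, z)`. The Lie algebra homomorphism used, $X \mapsto sX$, $Y \mapsto tY$, $Z \mapsto stZ$, is a hypothetical example of our own (it maps $\mathfrak{h}$ to itself and respects $[X, Y] = Z$); it is not taken from the text.

```python
def heis_mul(p, q):
    """Product of Heisenberg matrices [[1,a,b],[0,1,c],[0,0,1]], stored as (a, b, c)."""
    a1, b1, c1 = p
    a2, b2, c2 = q
    return (a1 + a2, b1 + b2 + a1 * c2, c1 + c2)

def heis_exp(v):
    """exp of xX + yY + zZ, stored as (x, y, z); the series stops after the quadratic term."""
    x, y, z = v
    return (x, z + x * y / 2, y)

def heis_log(p):
    """Inverse of heis_exp, defined on the whole Heisenberg group."""
    a, b, c = p
    return (a, c, b - a * c / 2)

def phi(v, s=2.0, t=-0.5):
    """Hypothetical Lie algebra homomorphism: X -> sX, Y -> tY, Z -> stZ."""
    x, y, z = v
    return (s * x, t * y, s * t * z)

def Phi(p):
    """Phi(A) = exp(phi(log A)), as in the proof of Theorem 5.2."""
    return heis_exp(phi(heis_log(p)))

def close(p, q, tol=1e-12):
    """Componentwise comparison up to rounding error."""
    return all(abs(u - w) < tol for u, w in zip(p, q))

A = (0.7, -1.3, 2.1)
B = (-0.4, 0.9, 1.5)
print(close(heis_exp(heis_log(A)), A))                       # log inverts exp
print(close(Phi(heis_mul(A, B)), heis_mul(Phi(A), Phi(B))))  # Phi is a homomorphism
```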


5.3 The Baker-Campbell-Hausdorff Formula 


The goal of the Baker-Campbell-Hausdorff formula (BCH formula) is to compute
$\log(e^X e^Y)$. One may well ask, “Why do we not simply expand both exponentials
and the logarithm in power series and multiply everything out?” While it is possible
to do this, what is not clear is why the answer is expressible in terms of commutators.
Consider, for example, the quadratic term in the expression for $\log(e^X e^Y)$, which
will be a linear combination of $X^2$, $Y^2$, $XY$, and $YX$. For this term to be expressible
in terms of commutators, it must be a multiple of $XY - YX$. Although direct
computation verifies that this is, indeed, the case, it is far from obvious how to
prove that a similar result occurs for all the higher terms.

We will actually state and prove an integral form of the BCH formula, rather 
than the series form (5.3). The integral version of the formula, along with the 
argument we present in Sect. 5.5, is actually due to Poincaré. (See [Poin1, Poin2] 
and Section 1.1.2.2 of [BF].) 

Consider the function

\[
g(z) = \frac{\log z}{1 - \frac{1}{z}},
\]

which is defined and holomorphic in the disk $\{|z - 1| < 1\}$. Thus, $g(z)$ can be
expressed as a series

\[
g(z) = \sum_{m=0}^{\infty} a_m (z - 1)^m,
\]

with radius of convergence one. If $V$ is a finite-dimensional vector space, we may
identify $V$ with $\mathbb{C}^n$ by means of an arbitrary basis, so that the Hilbert-Schmidt norm
(Definition 2.2) of a linear operator on $V$ can be defined. For any operator $A$ on $V$
with $\|A - I\| < 1$, we can define




\[
g(A) = \sum_{m=0}^{\infty} a_m (A - I)^m.
\]


We are now ready to state the integral form of the BCH formula. 


Theorem 5.3 (Baker-Campbell-Hausdorff). For all $n \times n$ complex matrices $X$
and $Y$ with $\|X\|$ and $\|Y\|$ sufficiently small, we have

\[
\log(e^X e^Y) = X + \int_0^1 g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})(Y) \, dt. \tag{5.8}
\]

The proof of this theorem is given in Sect. 5.5 of this chapter. Note that $e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}$
and, hence, also $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ are linear operators on the space $M_n(\mathbb{C})$ of all $n \times n$
complex matrices. In (5.8), this operator is being applied to the matrix $Y$. The fact
that $X$ and $Y$ are assumed small guarantees that $e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}$ is close to the identity
operator on $M_n(\mathbb{C})$ for $0 \le t \le 1$, so that $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ is well defined. Although
the right-hand side of (5.8) is rather complicated to compute explicitly, we are not
so much interested in the details of the formula but in the fact that it expresses
$\log(e^X e^Y)$ (and hence $e^X e^Y$) in terms of the Lie-algebraic quantities $\mathrm{ad}_X$ and $\mathrm{ad}_Y$.


5.4 The Derivative of the Exponential Map 


In this section we prove a result that is useful in its own right and will play a key
role in our proof of the BCH formula. Consider the directional derivative of exp at
a point $X$ in the direction of $Y$:

\[
\left. \frac{d}{dt} e^{X + tY} \right|_{t=0}. \tag{5.9}
\]

Unless $X$ and $Y$ commute, this derivative may not equal $e^X Y$. Nevertheless,
since exp is continuously differentiable (Proposition 2.16), the directional derivatives
in (5.9) will depend linearly on $Y$ with $X$ fixed. Now, the function $(1 - e^{-z})/z$
is an entire function of $z$ (with a removable singularity at the origin) and is given by
the power series

\[
\frac{1 - e^{-z}}{z} = \sum_{k=0}^{\infty} (-1)^k \frac{z^k}{(k+1)!},
\]

which has infinite radius of convergence. It thus makes sense to replace $z$ by an
arbitrary linear operator $A$ on a finite-dimensional vector space.
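Theorem 5.4 (Derivative of Exponential). For all $X, Y \in M_n(\mathbb{C})$, we have

\[
\begin{aligned}
\left. \frac{d}{dt} e^{X + tY} \right|_{t=0}
&= e^X \left\{ \frac{I - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}(Y) \right\} \\
&= e^X \left\{ Y - \frac{[X, Y]}{2!} + \frac{[X, [X, Y]]}{3!} - \cdots \right\}.
\end{aligned} \tag{5.10}
\]

More generally, if $X(t)$ is a smooth matrix-valued function, then

\[
\frac{d}{dt} e^{X(t)} = e^{X(t)} \left\{ \frac{I - e^{-\mathrm{ad}_{X(t)}}}{\mathrm{ad}_{X(t)}} \left( \frac{dX}{dt} \right) \right\}. \tag{5.11}
\]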

Our proof follows [Tuy]. 
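
Before turning to the proof, formula (5.10) can be compared with a finite-difference approximation of the derivative. The Python sketch below is a numerical illustration only; the helper names, the test matrices, and the step size `h` are our own choices, and the series in (5.10) is truncated after a fixed number of terms.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return mat_add(mat_mul(A, B), mat_mul(B, A), scale=-1.0)

def max_diff(A, B):
    """Largest absolute entrywise difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def exp_derivative_formula(X, Y, terms=30):
    """e^X * sum_{k>=0} (-1)^k ad_X^k(Y) / (k+1)!, the right-hand side of (5.10)."""
    total = [[0.0] * len(X) for _ in X]
    term = [row[:] for row in Y]                     # the k = 0 term
    for k in range(terms):
        total = mat_add(total, term)
        # next term: (-1)^(k+1) ad_X^(k+1)(Y) / (k+2)!
        term = [[-x / (k + 2) for x in row] for row in bracket(X, term)]
    return mat_mul(mat_exp(X), total)

def exp_derivative_numeric(X, Y, h=1e-5):
    """Central-difference approximation to d/dt exp(X + tY) at t = 0."""
    plus = mat_exp(mat_add(X, Y, scale=h))
    minus = mat_exp(mat_add(X, Y, scale=-h))
    return [[(p - q) / (2 * h) for p, q in zip(rp, rq)] for rp, rq in zip(plus, minus)]

X = [[0.2, -0.5], [0.7, 0.1]]
Y = [[1.0, 0.3], [-0.4, 0.6]]
error = max_diff(exp_derivative_formula(X, Y), exp_derivative_numeric(X, Y))
naive_gap = max_diff(exp_derivative_formula(X, Y), mat_mul(mat_exp(X), Y))
print(error, naive_gap)   # small error; visible gap from the naive guess e^X Y
```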


Lemma 5.5. If $Z$ is a linear operator on a finite-dimensional vector space, then

\[
\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} (e^{-Z/m})^k = \frac{1 - e^{-Z}}{Z}. \tag{5.12}
\]

Proof. If we formally applied the formula for the sum of a finite geometric series to
$e^{-Z/m}$, we would get

\[
\frac{1}{m} \sum_{k=0}^{m-1} (e^{-Z/m})^k = \frac{1}{m} \, \frac{1 - e^{-Z}}{1 - e^{-Z/m}} \;\longrightarrow\; \frac{1 - e^{-Z}}{Z}
\quad \text{as } m \to \infty.
\]

To give a rigorous argument, we observe that

\[
\frac{1 - e^{-x}}{x} = \int_0^1 e^{-tx} \, dt,
\]

from which it follows that

\[
\frac{1 - e^{-Z}}{Z} = \int_0^1 e^{-tZ} \, dt. \tag{5.13}
\]

(The reader may check, using term-by-term integration of the series expansion of
$e^{-tZ}$, that this formula for $(1 - e^{-Z})/Z$ agrees with our earlier definition.)

Since $(e^{-Z/m})^k = e^{-kZ/m}$, the left-hand side of (5.12) is a Riemann-sum
approximation to the matrix-valued integral on the right-hand side of (5.13). These
Riemann sums converge to the integral of $e^{-tZ}$, which is a continuous function of
$t$, establishing (5.12). □
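
The convergence in (5.12) can be observed numerically. The following Python sketch is an illustration under our own choices of helper names and of the matrix $Z$: it compares the Riemann sums on the left-hand side of (5.12) with the power series defining $(1 - e^{-Z})/Z$ and prints the error for increasing $m$.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

def max_diff(A, B):
    """Largest absolute entrywise difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def riemann_sum(Z, m):
    """(1/m) * sum_{k=0}^{m-1} (e^{-Z/m})^k, the left-hand side of (5.12)."""
    n = len(Z)
    step = mat_exp([[-z / m for z in row] for row in Z])
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(m):
        total = mat_add(total, power)
        power = mat_mul(power, step)
    return [[x / m for x in row] for row in total]

def limit_series(Z, terms=40):
    """sum_{k>=0} (-1)^k Z^k / (k+1)!, the series defining (1 - e^{-Z})/Z."""
    n = len(Z)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        total = mat_add(total, term)
        term = [[-x / (k + 2) for x in row] for row in mat_mul(term, Z)]
    return total

Z = [[0.5, 1.0], [-1.0, 0.3]]
errors = {m: max_diff(riemann_sum(Z, m), limit_series(Z)) for m in (10, 100, 1000)}
print(errors)   # the error shrinks as m grows
```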
Proof of Theorem 5.4. The formula (5.11) follows from (5.10) by applying the
chain rule to the composition of exp and $X(t)$. Thus, it suffices to prove (5.10).
For any $n \times n$ matrices $X$ and $Y$, set

\[
\Delta(X, Y) = \left. \frac{d}{dt} e^{X + tY} \right|_{t=0}.
\]

Since (Proposition 2.16) exp is a continuously differentiable map, $\Delta(X, Y)$ is jointly
continuous in $X$ and $Y$ and is linear in $Y$ for each fixed $X$.
Now, for every positive integer $m$, we have

\[
e^{X + tY} = \left[ \exp\left( \frac{X}{m} + t\frac{Y}{m} \right) \right]^m. \tag{5.14}
\]

Applying the product rule, we will get $m$ terms, where in each term, $m - 1$ of
the factors in (5.14) are simply evaluated at $t = 0$ and the remaining factor is
differentiated at $t = 0$. Thus,

\[
\begin{aligned}
\left. \frac{d}{dt} e^{X + tY} \right|_{t=0}
&= \sum_{k=0}^{m-1} (e^{X/m})^{m-k-1} \left[ \left. \frac{d}{dt} \exp\left( \frac{X}{m} + t\frac{Y}{m} \right) \right|_{t=0} \right] (e^{X/m})^k \\
&= e^{(m-1)X/m} \sum_{k=0}^{m-1} (e^{X/m})^{-k} \, \Delta\left( \frac{X}{m}, \frac{Y}{m} \right) (e^{X/m})^k \\
&= e^{(m-1)X/m} \, \frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left( -\frac{\mathrm{ad}_X}{m} \right) \right]^k \left( \Delta\left( \frac{X}{m}, Y \right) \right).
\end{aligned} \tag{5.15}
\]

In the third equality, we have used the linearity of $\Delta(X, Y)$ in $Y$ and the relationship
between Ad and ad (Proposition 3.35).

We now wish to let $m$ tend to infinity in (5.15). The factor in front tends
to $\exp(X)$. Since $\Delta(X, Y)$ is jointly continuous in $X$ and $Y$, the expression
$\Delta(X/m, Y)$ tends to $\Delta(0, Y)$, where it is easily verified that $\Delta(0, Y) = Y$. Finally,
applying Lemma 5.5 with $Z = \mathrm{ad}_X$, we see that

\[
\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} \left[ \exp\left( -\frac{\mathrm{ad}_X}{m} \right) \right]^k = \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}.
\]

Thus, by letting $m$ tend to infinity in (5.15), we obtain the desired result. □
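5.5 Proof of the BCH Formula


We now turn to the proof of Theorem 5.3. For sufficiently small $X$ and $Y$ in $M_n(\mathbb{C})$,
let

\[
Z(t) = \log(e^X e^{tY})
\]

for $0 \le t \le 1$. Our goal is to compute $Z(1)$. Since $e^{Z(t)} = e^X e^{tY}$, we have

\[
e^{-Z(t)} \frac{d}{dt} e^{Z(t)} = (e^X e^{tY})^{-1} e^X e^{tY} Y = Y.
\]

On the other hand, by Theorem 5.4,

\[
e^{-Z(t)} \frac{d}{dt} e^{Z(t)} = \left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\} \left( \frac{dZ}{dt} \right).
\]

Hence,

\[
\left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\} \left( \frac{dZ}{dt} \right) = Y.
\]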
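Now, if $X$ and $Y$ are small enough, $Z(t)$ will also be small, so that
$[I - e^{-\mathrm{ad}_{Z(t)}}]/\mathrm{ad}_{Z(t)}$ will be close to the identity and thus invertible. In that case, we
obtain

\[
\frac{dZ}{dt} = \left\{ \frac{I - e^{-\mathrm{ad}_{Z(t)}}}{\mathrm{ad}_{Z(t)}} \right\}^{-1} (Y). \tag{5.16}
\]

Meanwhile, if we apply the homomorphism “Ad” to the equation $e^{Z(t)} = e^X e^{tY}$,
use the relationship between “Ad” and “ad,” and take a logarithm, we obtain the
following relations:

\[
\begin{aligned}
\mathrm{Ad}_{e^{Z(t)}} &= \mathrm{Ad}_{e^X} \mathrm{Ad}_{e^{tY}}, \\
e^{\mathrm{ad}_{Z(t)}} &= e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}, \\
\mathrm{ad}_{Z(t)} &= \log(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y}).
\end{aligned}
\]

Plugging the last two of these relations into (5.16) gives

\[
\frac{dZ}{dt} = \left\{ \frac{I - (e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})^{-1}}{\log(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})} \right\}^{-1} (Y). \tag{5.17}
\]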
Now, observe that

\[
g(z) = \left\{ \frac{1 - z^{-1}}{\log z} \right\}^{-1},
\]

so that (5.17) is the same as

\[
\frac{dZ}{dt} = g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})(Y). \tag{5.18}
\]

Noting that $Z(0) = X$ and integrating (5.18) gives

\[
\log(e^X e^Y) = Z(1) = X + \int_0^1 g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})(Y) \, dt,
\]

which is the Baker-Campbell-Hausdorff formula.


5.6 The Series Form of the BCH Formula 


Let us see how to get the first few terms of the series form of the BCH formula from
the integral form in Theorem 5.3. Using the Taylor series (2.7) for the logarithm, we
may easily compute that

\[
g(z) = 1 + \frac{1}{2}(z - 1) - \frac{1}{6}(z - 1)^2 + \frac{1}{12}(z - 1)^3 - \cdots.
\]

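Meanwhile,

\[
\begin{aligned}
e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I
&= \left( I + \mathrm{ad}_X + \frac{(\mathrm{ad}_X)^2}{2} + \cdots \right) \left( I + t\,\mathrm{ad}_Y + \frac{t^2 (\mathrm{ad}_Y)^2}{2} + \cdots \right) - I \\
&= \mathrm{ad}_X + t\,\mathrm{ad}_Y + t\,\mathrm{ad}_X \mathrm{ad}_Y + \frac{(\mathrm{ad}_X)^2}{2} + \frac{t^2 (\mathrm{ad}_Y)^2}{2} + \cdots.
\end{aligned}
\]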


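Since $e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I$ has no zeroth-order term, $(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y} - I)^m$ will contribute
only terms of degree $m$ or higher in $\mathrm{ad}_X$ and/or $\mathrm{ad}_Y$. Computing up to degree 2 in
$\mathrm{ad}_X$ and $\mathrm{ad}_Y$ gives

\[
\begin{aligned}
g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})
&= I + \frac{1}{2} \left( \mathrm{ad}_X + t\,\mathrm{ad}_Y + t\,\mathrm{ad}_X \mathrm{ad}_Y + \frac{(\mathrm{ad}_X)^2}{2} + \frac{t^2 (\mathrm{ad}_Y)^2}{2} \right) \\
&\quad - \frac{1}{6} \left[ (\mathrm{ad}_X)^2 + t^2 (\mathrm{ad}_Y)^2 + t\,\mathrm{ad}_X \mathrm{ad}_Y + t\,\mathrm{ad}_Y \mathrm{ad}_X \right] \\
&\quad + \text{higher-order terms}.
\end{aligned}
\]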


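We now apply $g(e^{\mathrm{ad}_X} e^{t\,\mathrm{ad}_Y})$ to $Y$ and integrate. Computing to second order and
noting that any term with $\mathrm{ad}_Y$ acting first is zero, we obtain:

\[
\begin{aligned}
\log(e^X e^Y)
&\approx X + \int_0^1 \left[ Y + \frac{1}{2}[X, Y] + \frac{1}{4}[X, [X, Y]] - \frac{1}{6}[X, [X, Y]] - \frac{t}{6}[Y, [X, Y]] \right] dt \\
&= X + Y + \frac{1}{2}[X, Y] + \frac{1}{12}[X, [X, Y]] - \frac{1}{12}[Y, [X, Y]],
\end{aligned}
\]

which is the expression in (5.3).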
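
The truncated series can be compared with a direct numerical evaluation of $\log(e^X e^Y)$ for small matrices. The Python sketch below is an illustration only: the helper names and the matrices $X$, $Y$ are our own choices, the logarithm is computed from its power series about the identity, and the output merely shows that including the third-order terms reduces the error for this particular pair.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=30):
    """Truncated power series for the matrix exponential."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

def mat_log(A, terms=40):
    """Power series log A = sum_{k>=1} (-1)^(k+1) (A - I)^k / k, for A close to I."""
    n = len(A)
    B = [[A[i][j] - float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in B]
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        total = mat_add(total, power, scale=(-1.0) ** (k + 1) / k)
        power = mat_mul(power, B)
    return total

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return mat_add(mat_mul(A, B), mat_mul(B, A), scale=-1.0)

def max_diff(A, B):
    """Largest absolute entrywise difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 0.02], [0.01, 0.0]]
Y = [[0.015, 0.0], [0.02, -0.015]]
Z = mat_log(mat_mul(mat_exp(X), mat_exp(Y)))

C = bracket(X, Y)
second_order = mat_add(mat_add(X, Y), C, scale=0.5)
third_order = mat_add(mat_add(second_order, bracket(X, C), scale=1 / 12),
                      bracket(Y, C), scale=-1 / 12)
err2 = max_diff(Z, second_order)   # error of X + Y + [X,Y]/2
err3 = max_diff(Z, third_order)    # error after adding the third-order terms
print(err2, err3)
```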


5.7 Group Versus Lie Algebra Homomorphisms 


Recall Theorem 3.28, which says that given matrix Lie groups $G$ and $H$ and a Lie
group homomorphism $\Phi : G \to H$, we can find a Lie algebra homomorphism
$\phi : \mathfrak{g} \to \mathfrak{h}$ such that $\Phi(e^X) = e^{\phi(X)}$ for all $X \in \mathfrak{g}$. In this section, we prove a
converse to this result in the case that $G$ is simply connected.


Theorem 5.6. Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$,
respectively, and let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra homomorphism. If $G$ is simply
connected, there exists a unique Lie group homomorphism $\Phi : G \to H$ such that
$\Phi(e^X) = e^{\phi(X)}$ for all $X \in \mathfrak{g}$.


This result has the following corollary. 


Corollary 5.7. Suppose $G$ and $H$ are simply connected matrix Lie groups with Lie
algebras $\mathfrak{g}$ and $\mathfrak{h}$, respectively. If $\mathfrak{g}$ is isomorphic to $\mathfrak{h}$, then $G$ is isomorphic to $H$.


Proof. Let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra isomorphism. By Theorem 5.6, there exists
an associated Lie group homomorphism $\Phi : G \to H$. Since $\psi := \phi^{-1}$ is also
a Lie algebra homomorphism, there is a corresponding Lie group homomorphism
$\Psi : H \to G$. We want to show that $\Phi$ and $\Psi$ are inverses of each other.

Now, the Lie algebra homomorphism associated to $\Phi \circ \Psi$ is, by Proposition 3.30,
equal to $\phi \circ \psi = I$, and similarly for $\Psi \circ \Phi$. Thus, by Corollary 3.49, both $\Phi \circ \Psi$
and $\Psi \circ \Phi$ are equal to the identity maps on $H$ and $G$, respectively. □


We now proceed with the proof of Theorem 5.6. The first step is to construct a
“local homomorphism” from $\phi$. This step is the only place in the argument in which
we use the BCH formula.
Definition 5.8. If $G$ and $H$ are matrix Lie groups, a local homomorphism of $G$
to $H$ is a pair $(U, f)$ where $U$ is a path-connected neighborhood of the identity in $G$
and $f : U \to H$ is a continuous map such that $f(AB) = f(A)f(B)$ whenever $A$,
$B$, and $AB$ all belong to $U$.


The definition says that f is as much of a homomorphism as it makes sense to 
be, given that U is not necessarily a subgroup of G. 


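Proposition 5.9. Let $G$ and $H$ be matrix Lie groups with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$,
respectively, and let $\phi : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra homomorphism. Define $U_\varepsilon \subset G$ by

\[
U_\varepsilon = \{ A \in G \mid \|A - I\| < 1 \text{ and } \|\log A\| < \varepsilon \}.
\]

Then there exists some $\varepsilon > 0$ such that the map $f : U_\varepsilon \to H$ given by

\[
f(A) = e^{\phi(\log A)}
\]

is a local homomorphism.


Note that by Theorem 3.42, if $\varepsilon$ is small enough, $\log A$ will be in $\mathfrak{g}$ for all $A \in U_\varepsilon$,
so that $f$ makes sense.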


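Proof. Choose $\varepsilon$ small enough that Theorem 3.42 applies and small enough that for
all $A, B \in U_\varepsilon$, the BCH formula applies to $X := \log A$ and $Y := \log B$ and also to
$\phi(X)$ and $\phi(Y)$. Then if $AB$ happens to be in $U_\varepsilon$, we have

\[
f(AB) = f(e^X e^Y) = e^{\phi(\log(e^X e^Y))}.
\]

We now compute $\log(e^X e^Y)$ by the BCH formula and then apply $\phi$. Since $\phi$ is a Lie
algebra homomorphism, it will change all the Lie-algebraic quantities involving $X$
and $Y$ in the BCH formula into the analogous quantities involving $\phi(X)$ and $\phi(Y)$.
Thus, as in (5.4), we have

\[
\begin{aligned}
\phi[\log(e^X e^Y)] &= \phi(X) + \int_0^1 \sum_{m=0}^{\infty} a_m \left( e^{\mathrm{ad}_{\phi(X)}} e^{t\,\mathrm{ad}_{\phi(Y)}} - I \right)^m (\phi(Y)) \, dt \\
&= \log(e^{\phi(X)} e^{\phi(Y)}).
\end{aligned}
\]

We obtain, then,

\[
\begin{aligned}
f(AB) &= \exp\left\{ \log(e^{\phi(X)} e^{\phi(Y)}) \right\} \\
&= e^{\phi(X)} e^{\phi(Y)} \\
&= f(A) f(B),
\end{aligned}
\]

as claimed. □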
Theorem 5.10. Let $G$ and $H$ be matrix Lie groups, with $G$ simply connected. If
$(U, f)$ is a local homomorphism of $G$ into $H$, there exists a unique (global) Lie
group homomorphism $\Phi : G \to H$ such that $\Phi$ agrees with $f$ on $U$.


Proof. Step 1: Define $\Phi$ along a path. Since $G$ is simply connected and thus
connected, for any $A \in G$, there exists a path $A(t) \in G$ with $A(0) = I$ and
$A(1) = A$. Let us call a partition $0 = t_0 < t_1 < t_2 < \cdots < t_m = 1$ of $[0, 1]$ a good
partition if for all $s$ and $t$ belonging to the same subinterval of the partition, we
have

\[
A(t)A(s)^{-1} \in U. \tag{5.19}
\]

Lemma 3.48 guarantees that good partitions exist. If a partition is good, then,
in particular, since $t_0 = 0$ and $A(0) = I$, we have $A(t_1) \in U$. Choose a good
partition and write $A$ as

\[
A = [A(1)A(t_{m-1})^{-1}] \, [A(t_{m-1})A(t_{m-2})^{-1}] \cdots [A(t_2)A(t_1)^{-1}] \, A(t_1).
\]

Since $\Phi$ is supposed to be a homomorphism and is supposed to agree with $f$
near the identity, it is reasonable to “define” $\Phi(A)$ by

\[
\Phi(A) = f(A(1)A(t_{m-1})^{-1}) \cdots f(A(t_2)A(t_1)^{-1}) \, f(A(t_1)). \tag{5.20}
\]

In the next two steps, we will prove that $\Phi(A)$ is independent of the choice of
partition for a fixed path and independent of the choice of path.

Step 2: Prove independence of the partition. For any good partition, if we insert
an extra partition point $s$ between $t_j$ and $t_{j+1}$, the result is easily seen to be
another good partition. This change in the partition has the effect of replacing the
factor $f(A(t_{j+1})A(t_j)^{-1})$ in (5.20) by

\[
f(A(t_{j+1})A(s)^{-1}) \, f(A(s)A(t_j)^{-1}).
\]

Since $s$ is between $t_j$ and $t_{j+1}$, the condition (5.19) on the original partition
guarantees that $A(t_{j+1})A(s)^{-1}$, $A(s)A(t_j)^{-1}$, and $A(t_{j+1})A(t_j)^{-1}$ are all in $U$.
Thus, since $f$ is a local homomorphism, we have

\[
f(A(t_{j+1})A(t_j)^{-1}) = f(A(t_{j+1})A(s)^{-1}) \, f(A(s)A(t_j)^{-1}),
\]

showing that the value of $\Phi(A)$ is unchanged by the addition of the extra partition
point.

By repeating this argument, we see that the value of $\Phi(A)$ does not change by
the addition of any finite number of points to the partition. Now, any two good
partitions have a common refinement, namely their union, which is also a good
partition. The above argument shows that the value of $\Phi(A)$ computed from the
first partition is the same as for the common refinement, which is the same as for
the second partition.
Step 3: Prove independence of the path. It is in this step that we use the
simple connectedness of $G$. Suppose $A_0(t)$ and $A_1(t)$ are two paths joining
the identity to some $A \in G$. Then, since $G$ is simply connected, a standard
topological argument (e.g., Proposition 1.6 in [Hat]) shows that $A_0$ and $A_1$ are
homotopic with endpoints fixed. This means that there exists a continuous map
$A : [0, 1] \times [0, 1] \to G$ with

\[
A(0, t) = A_0(t), \quad A(1, t) = A_1(t)
\]

for all $t \in [0, 1]$ and also

\[
A(s, 0) = I, \quad A(s, 1) = A
\]

for all $s \in [0, 1]$.
As in the proof of Lemma 3.48, there exists an integer $N$ such that for all $(s, t)$
and $(s', t')$ in $[0, 1] \times [0, 1]$ with $|s - s'| \le 2/N$ and $|t - t'| \le 2/N$, we have

\[
A(s, t)A(s', t')^{-1} \in U.
\]

We now deform $A_0$ “a little bit at a time” into $A_1$. This means that we define a
sequence $B_{k,l}$ of paths, with $k = 0, \ldots, N - 1$ and $l = 0, \ldots, N$. We define
these paths so that $B_{k,l}(t)$ coincides with $A((k + 1)/N, t)$ for $t$ between 0 and
$(l - 1)/N$, and $B_{k,l}(t)$ coincides with $A(k/N, t)$ for $t$ between $l/N$ and 1. For
$t$ between $(l - 1)/N$ and $l/N$, we define $B_{k,l}(t)$ to coincide with the values
of $A(\cdot, \cdot)$ on the path that goes “diagonally” in the $(s, t)$-plane, as indicated in
Figure 5.1. When $l = 0$, there are no $t$-values between 0 and $(l - 1)/N$, so
$B_{k,0}(t) = A(k/N, t)$ for all $t \in [0, 1]$. In particular, $B_{0,0}(t) = A_0(t)$.


Fig. 5.1 The path in the $(s, t)$-plane defining $B_{k,l}(t)$
We now deform $A_0 = B_{0,0}$ into $B_{0,1}$ and then into $B_{0,2}$, $B_{0,3}$, and so on until we
reach $B_{0,N}$, which we then deform into $B_{1,0}$, and so on until we reach $B_{N-1,N}$,
which we finally deform into $A_1$. We claim that the value of $\Phi(A)$ is the same at
each stage of this deformation. Note that for $l < N$, $B_{k,l}(t)$ and $B_{k,l+1}(t)$ are the
same except for $t$'s in the interval

\[
[(l - 1)/N, \; (l + 1)/N].
\]

Furthermore, by Step 2, we are free to choose any good partition we like to
compute $\Phi(A)$. For both $B_{k,l}$ and $B_{k,l+1}$, we choose the partition points to be

\[
0, \; \frac{1}{N}, \; \ldots, \; \frac{l - 1}{N}, \; \frac{l + 1}{N}, \; \frac{l + 2}{N}, \; \ldots, \; 1,
\]

which gives a good partition by the way $N$ was chosen.

Now, from (5.20), the value of $\Phi(A)$ depends only on the values of the path at the
partition points. Since we have chosen our partition in such a way that the values
of $B_{k,l}$ and $B_{k,l+1}$ are identical at all the partition points, the value of $\Phi(A)$ is
the same for these two paths. (See Figure 5.2.) A similar argument shows that
the value of $\Phi(A)$ computed along $B_{k,N}$ is the same as along $B_{k+1,0}$. Thus, the
value of $\Phi(A)$ is the same for each path from $A_0 = B_{0,0}$ all the way to $B_{N-1,N}$
and then the same as $A_1$.


Fig. 5.2 The paths $B_{k,l}$ and $B_{k,l+1}$ agree at each partition point. In the figure, $s$
increases as we move from the top toward the bottom

Step 4: Prove that $\Phi$ is a homomorphism and agrees with $f$ on $U$. The proof that
$\Phi$ is a homomorphism is a straightforward unpacking of the definition of $\Phi$ and
is left to the reader; see Exercise 7. To show that $\Phi$ agrees with $f$ on $U$, choose
$A \in U$. Since $U$ is path connected, we can find a path $A(t)$, $0 \le t \le 1$, lying in
$U$ joining $I$ to $A$. Choose a good partition $\{t_j\}_{j=0}^{m}$ for $A(t)$; we then claim
that for all $j$, we have $\Phi(A(t_j)) = f(A(t_j))$.

Note that $A(t)$, $0 \le t \le t_j$, is a path joining $I$ to $A(t_j)$ and that $\{t_0, t_1, \ldots, t_j\}$ is
a good partition of this path. (Technically, we should reparameterize this path so
that the time interval is $[0,1]$.) Hence,

\[ \Phi(A(t_j)) = f(A(t_j)A(t_{j-1})^{-1}) \cdots f(A(t_2)A(t_1)^{-1}) f(A(t_1)) \]

for all $j$. In particular,

\[ \Phi(A(t_1)) = f(A(t_1)). \]

Now assume that $\Phi(A(t_j)) = f(A(t_j))$, and compute that

\[
\begin{aligned}
\Phi(A(t_{j+1})) &= f(A(t_{j+1})A(t_j)^{-1}) f(A(t_j)A(t_{j-1})^{-1}) \cdots f(A(t_1)) \\
&= f(A(t_{j+1})A(t_j)^{-1}) \Phi(A(t_j)) \\
&= f(A(t_{j+1})A(t_j)^{-1}) f(A(t_j)) \\
&= f(A(t_{j+1})).
\end{aligned}
\]

The last equality holds because $f$ is a local homomorphism and because
$A(t_{j+1})A(t_j)^{-1}$, $A(t_j)$, and their product all lie in $U$. Thus, by induction,
$\Phi(A(t_j)) = f(A(t_j))$ for all $j$; when $j = m$, we obtain $\Phi(A) = f(A)$. □


It is important to note that when proving independence of the path (Step 3 of
the proof), it is essential to know already that the value of $\Phi(A)$ is independent of
the choice of good partition (Step 2 of the proof). Specifically, when we move from
$B_{k,l}$ to $B_{k,l+1}$, we use one partition for $B_{k,l+1}$, but when we move from $B_{k,l+1}$ to
$B_{k,l+2}$, we use a different partition for $B_{k,l+1}$. Essentially, the proof proceeds by
deforming the path between partition points (which clearly does not change the
value of $\Phi(A)$), then picking a new partition and doing the same thing again. Note
also that it is in proving independence of the partition that we use the assumption
that $f$ is a local homomorphism.
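
The construction of $\Phi$ from a local homomorphism can be tried out numerically in the simplest possible setting. The following is only a sketch under stated assumptions, not the general construction: the simply connected group is taken to be the additive group of real numbers, the target is the unit circle in the complex plane, the neighborhood is $U = (-1, 1)$, and the local homomorphism is $f(x) = e^{ix}$. The names `f`, `phi_along`, `uniform_partition`, `straight`, and `wiggly` are hypothetical, and "good partition" is simplified to "consecutive path values differ by an element of $U$".

```python
import cmath
import math


def f(x):
    """Local homomorphism on U = (-1, 1); deliberately undefined elsewhere."""
    if abs(x) >= 1.0:
        raise ValueError("f is only defined on U = (-1, 1)")
    return cmath.exp(1j * x)


def phi_along(path, partition):
    """Product f(A(t_m) - A(t_{m-1})) ... f(A(t_1) - A(t_0)).

    This is the additive-group analogue of the product of values of f
    along a partition used to define Phi in the proof.
    """
    value = 1.0 + 0.0j
    for t_prev, t_next in zip(partition, partition[1:]):
        value = f(path(t_next) - path(t_prev)) * value
    return value


def uniform_partition(n):
    """Partition points 0, 1/n, ..., 1."""
    return [j / n for j in range(n + 1)]


def straight(t, end=7.5):
    """Straight path from 0 to `end`."""
    return t * end


def wiggly(t, end=7.5):
    """Path with the same endpoints as `straight`, but with a detour."""
    return t * end + 3.0 * math.sin(math.pi * t)


phi_coarse = phi_along(straight, uniform_partition(10))   # steps of size 0.75
phi_fine = phi_along(straight, uniform_partition(40))     # a refinement-like partition
phi_detour = phi_along(wiggly, uniform_partition(200))    # a different path
print(phi_coarse, phi_fine, phi_detour)
```

All three values agree up to rounding, which illustrates independence of the partition and of the path in this toy case. A partition that is too coarse (for example `uniform_partition(5)`, with steps of size 1.5) makes `f` raise an error, reflecting that `f` is only defined near the identity.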


Proof of Theorem 5.6. For the existence part of the proof, let $f$ be the local
homomorphism in Proposition 5.9 and let $\Phi$ be the global homomorphism in
Theorem 5.10. Then for any $X \in \mathfrak{g}$, the element $e^{X/m}$ will be in $U$ for all sufficiently
large $m$, showing that

\[ \Phi(e^{X/m}) = f(e^{X/m}) = e^{\phi(X)/m}. \]

Since $\Phi$ is a homomorphism, we have

\[ \Phi(e^{X}) = \Phi(e^{X/m})^{m} = e^{\phi(X)}, \]

as required.
For the uniqueness part of the proof, suppose $\Phi_1$ and $\Phi_2$ are two homomorphisms
related in the desired way to $\phi$. Then for any $A \in G$, we express $A$ as $e^{X_1} \cdots e^{X_N}$
with $X_j \in \mathfrak{g}$, as in Corollary 3.47, and observe that

\[ \Phi_1(A) = \Phi_2(A) = e^{\phi(X_1)} \cdots e^{\phi(X_N)}, \]

so that $\Phi_1$ agrees with $\Phi_2$. □


We conclude this section with a typical application of Theorem 5.6. 


Theorem 5.11. Suppose that $G$ is a simply connected matrix Lie group and that
the Lie algebra $\mathfrak{g}$ of $G$ decomposes as a Lie algebra direct sum $\mathfrak{g} = \mathfrak{h}_1 \oplus \mathfrak{h}_2$, for two
subalgebras $\mathfrak{h}_1$ and $\mathfrak{h}_2$ of $\mathfrak{g}$. Then there exist closed, simply connected subgroups
$H_1$ and $H_2$ of $G$ whose Lie algebras are $\mathfrak{h}_1$ and $\mathfrak{h}_2$, respectively. Furthermore, $G$ is
isomorphic to the direct product of $H_1$ and $H_2$.


Proof. Consider the Lie algebra homomorphism $\phi : \mathfrak{g} \to \mathfrak{g}$ that sends $X + Y$ to
$X$, where $X \in \mathfrak{h}_1$ and $Y \in \mathfrak{h}_2$. Since $G$ is simply connected, there is a Lie group
homomorphism $\Phi : G \to G$ associated to $\phi$, and the Lie algebra of the kernel of $\Phi$
is the kernel of $\phi$ (Proposition 3.31), which is $\mathfrak{h}_2$. Let $H_2$ be the identity component
of $\ker \Phi$. Since $\Phi$ is continuous, $\ker \Phi$ is closed, and so is its identity component $H_2$
(Corollary 3.52). Thus, $H_2$ is a closed, connected subgroup of $G$ with Lie algebra
$\mathfrak{h}_2$. By a similar argument, we may construct a closed, connected subgroup $H_1$ of $G$
whose Lie algebra is $\mathfrak{h}_1$.

Suppose now that $A(t)$ is a loop in $H_1$. Since $G$ is simply connected, there is
a homotopy $A(s,t)$ shrinking $A(t)$ to a point in $G$. Now, $\phi$ is the identity on $\mathfrak{h}_1$,
from which it follows that $\Phi$ is the identity on $H_1$. Thus, if we define $B(s,t) =
\Phi(A(s,t))$, we see that

\[ B(0,t) = \Phi(A(t)) = A(t). \]

Furthermore, since $\phi$ maps $\mathfrak{g}$ into $\mathfrak{h}_1$, we see that $\Phi$ maps $G$ into $H_1$. We conclude
that $B$ is a homotopy of $A(t)$ to a point in $H_1$. Thus, $H_1$ is simply connected, and,
by a similar argument, so is $H_2$.

Finally, since $\mathfrak{g}$ is the Lie algebra direct sum of $\mathfrak{h}_1$ and $\mathfrak{h}_2$, elements of $\mathfrak{h}_1$
commute with elements of $\mathfrak{h}_2$. It follows that elements of $H_1$ (which are all products
of exponentials of elements of $\mathfrak{h}_1$) commute with elements of $H_2$. Thus, we have a
Lie group homomorphism $\Psi : H_1 \times H_2 \to G$ given by $\Psi(A, B) = AB$. The
associated Lie algebra homomorphism $\psi$ is then just the original isomorphism of
$\mathfrak{h}_1 \oplus \mathfrak{h}_2$ with $\mathfrak{g}$. Since $G$ is simply connected, there is a homomorphism $\Gamma : G \to
H_1 \times H_2$ for which the associated Lie algebra homomorphism is $\psi^{-1}$. By the proof
of Corollary 5.7, $\Gamma$ and $\Psi$ are inverses of each other, showing that $G$ is isomorphic
to $H_1 \times H_2$. □

5.8 Universal Covers


Theorem 5.6 says that if $G$ is simply connected, every homomorphism of the Lie
algebra $\mathfrak{g}$ of $G$ can be exponentiated to a homomorphism of $G$. If $G$ is not simply
connected, we may look for another group $\tilde{G}$ that has the same Lie algebra as $G$ but
such that $\tilde{G}$ is simply connected.


Definition 5.12. Let $G$ be a connected matrix Lie group. Then a universal cover
of $G$ is a simply connected matrix Lie group $H$ together with a Lie group
homomorphism $\Phi : H \to G$ such that the associated Lie algebra homomorphism
$\phi : \mathfrak{h} \to \mathfrak{g}$ is a Lie algebra isomorphism. The homomorphism $\Phi$ is called the
covering map.


If a universal cover of G exists, it is unique up to “canonical isomorphism,” as 
follows. 


Proposition 5.13. If $G$ is a connected matrix Lie group and $(H_1, \Phi_1)$ and $(H_2, \Phi_2)$
are universal covers of $G$, then there exists a Lie group isomorphism $\Psi : H_1 \to H_2$
such that $\Phi_2 \circ \Psi = \Phi_1$.


Proof. See Exercise 9. □


Since a connected matrix Lie group has at most one universal cover (up to
canonical isomorphism), it is reasonable to speak of the universal cover $(\tilde{G}, \Phi)$ of
$G$. Furthermore, if $H$ is a simply connected Lie group and $\phi : \mathfrak{h} \to \mathfrak{g}$ is a Lie
algebra isomorphism, then by Theorem 5.6, we can construct an associated Lie
group homomorphism $\Phi : H \to G$, so that $(H, \Phi)$ is a universal cover of $G$.
Since $\phi$ is an isomorphism, we can use $\phi$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$. Thus, in slightly less
formal terms, we may define the notion of universal cover as follows: The universal
cover of a matrix Lie group $G$ is a simply connected matrix Lie group $\tilde{G}$ such that
the Lie algebra of $\tilde{G}$ is equal to the Lie algebra of $G$. With this perspective, we
have the following immediate corollary of Theorem 5.6.


Corollary 5.14. Let $G$ be a connected matrix Lie group and let $\tilde{G}$ be the universal
cover of $G$, where we think of $G$ and $\tilde{G}$ as having the same Lie algebra $\mathfrak{g}$. If $H$ is a
matrix Lie group with Lie algebra $\mathfrak{h}$ and $\phi : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra homomorphism,
there exists a unique homomorphism $\Phi : \tilde{G} \to H$ such that $\Phi(e^{X}) = e^{\phi(X)}$ for all
$X \in \mathfrak{g}$.


An example of importance in physics is the universal cover of SO(3). 
Example 5.15. The universal cover of SO(3) is SU(2). 


Proof. The group SU(2) is simply connected by Proposition 1.15. Proposition 1.19
and Example 3.29 then provide the desired covering map. □
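
The two-to-one nature of the covering map from SU(2) to SO(3) can be checked numerically. The sketch below is an illustration only: it assumes the standard realization in which $U \in \mathrm{SU}(2)$ acts on traceless self-adjoint $2 \times 2$ matrices by $X \mapsto U X U^{*}$, written in the basis of Pauli matrices; this choice of basis, and the helper names `su2`, `covering_map`, `matmul`, `dagger`, `transpose`, and `det3`, are assumptions made for the example rather than notation from the text.

```python
import math

# Pauli matrices, used here as a basis of the traceless self-adjoint 2x2 matrices.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(a, b):
    """Matrix product of nested sequences."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def transpose(a):
    return [[a[j][i] for j in range(len(a))] for i in range(len(a[0]))]


def det3(r):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
            - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
            + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0]))


def su2(alpha, beta):
    """Element [[a, -conj(b)], [b, conj(a)]] of SU(2), after normalizing (alpha, beta)."""
    norm = math.hypot(abs(alpha), abs(beta))
    a, b = complex(alpha) / norm, complex(beta) / norm
    return [[a, -b.conjugate()], [b, a.conjugate()]]


def covering_map(u):
    """Real 3x3 matrix R with  u sigma_k u^* = sum_j R[j][k] sigma_j."""
    ud = dagger(u)
    rot = []
    for j in range(3):
        row = []
        for k in range(3):
            m = matmul(PAULI[j], matmul(u, matmul(PAULI[k], ud)))
            row.append(((m[0][0] + m[1][1]) / 2).real)  # (1/2) trace
        rot.append(row)
    return rot


u = su2(1 + 2j, 0.5 - 1j)
v = su2(0.3j, 2.0)
minus_u = [[-entry for entry in row] for row in u]

r_u = covering_map(u)
r_minus_u = covering_map(minus_u)
r_uv = covering_map(matmul(u, v))
r_u_r_v = matmul(r_u, covering_map(v))
```

In this sketch `r_u` comes out orthogonal with determinant 1, `r_minus_u` coincides with `r_u` (so $U$ and $-U$ have the same image), and `r_uv` agrees with `r_u_r_v`, as expected of a homomorphism.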


The topic of universal covers is one place where we pay a price for our decision 
to consider only matrix Lie groups: a matrix Lie group may not have a universal 
cover that is a matrix Lie group. It is not hard to show that every Lie group G
has a universal cover in the class of (not necessarily matrix) Lie groups. Indeed,
G has a universal cover in the topological sense (Definition 13.1), and this cover 
can be given a group structure in such a way that the covering map is a Lie group 
homomorphism. It turns out, however, that the universal cover of a matrix Lie group 
may not be a matrix group. 

We now show that the group SL(2; R) does not have a universal cover in the class 
of matrix Lie groups. We begin by showing that SL(2; R) is not simply connected. 
By Theorem 2.17 and Proposition 2.19, SL(2; R), as a manifold, is homeomorphic 
to SO(2) x V, where V is the space of 2 x 2 real, symmetric matrices with trace 
zero. Now, V, being a vector space, is certainly simply connected, but SO(2), which
is homeomorphic to the unit circle $S^1$, is not. Thus, SL(2; R) itself is not simply
connected. By contrast, the group SL(2; C) decomposes as SU(2) x W, where W is
the space of 2 x 2 self-adjoint, complex matrices with trace zero. Since both SU(2) 
and W are simply connected, SL(2; C) is simply connected. (See Appendix 13.3 for 
more information on this type of calculation.) 
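
The splitting of SL(2; R) into a rotation and a symmetric factor can be made concrete for a single matrix. The following sketch assumes the polar decomposition in the form $A = R(\theta) P$, with $R(\theta) \in \mathrm{SO}(2)$ and $P$ symmetric and positive definite (so that $P = e^{X}$ for a symmetric traceless $X$); the closed-form angle used below is specific to $2 \times 2$ matrices with positive determinant, and the function name `polar_sl2r` is hypothetical.

```python
import math


def polar_sl2r(a, b, c, d):
    """Split A = [[a, b], [c, d]] (det A = 1) as A = R(theta) P.

    R(theta) = [[cos, -sin], [sin, cos]] is a rotation and P = R(theta)^T A
    is symmetric; theta is chosen so that the off-diagonal entries of P agree.
    """
    theta = math.atan2(c - b, a + d)
    cs, sn = math.cos(theta), math.sin(theta)
    p = [
        [cs * a + sn * c, cs * b + sn * d],
        [-sn * a + cs * c, -sn * b + cs * d],
    ]
    return theta, p


# Example: A = [[2, 3], [1, 2]] has determinant 1.
theta, p = polar_sl2r(2.0, 3.0, 1.0, 2.0)
det_p = p[0][0] * p[1][1] - p[0][1] * p[1][0]
trace_p = p[0][0] + p[1][1]
print(theta, p, det_p, trace_p)
```

The angle `theta` records the SO(2) factor, which is the factor that carries the nontrivial loops, while `p` has determinant 1 and positive trace and so lies in the contractible factor.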


Proposition 5.16. Let $G \subset \mathrm{GL}(n;\mathbb{C})$ be a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. Suppose $\Phi : G \to \mathrm{SL}(2;\mathbb{R})$ is a Lie group homomorphism for which
the associated Lie algebra homomorphism $\phi : \mathfrak{g} \to \mathfrak{sl}(2;\mathbb{R})$ is a Lie algebra
isomorphism. Then $\Phi$ is a Lie group isomorphism and, therefore, $G$ cannot be simply
connected. Thus, SL(2; R) has no universal cover in the class of matrix Lie groups.


The result relies essentially on the assumption that G is a matrix Lie group—or, 
more precisely, on the assumption that G is contained in a group whose Lie algebra 
is complex. 


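Lemma 5.17. Suppose $\psi : \mathfrak{sl}(2;\mathbb{R}) \to \mathfrak{gl}(n;\mathbb{C})$ is a Lie algebra homomorphism.
Then there exists a Lie group homomorphism $\Psi : \mathrm{SL}(2;\mathbb{R}) \to \mathrm{GL}(n;\mathbb{C})$ such that
$\Psi(e^{X}) = e^{\psi(X)}$ for all $X \in \mathfrak{sl}(2;\mathbb{R})$.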


The significance of the lemma is that the result holds even though SL(2; R) is 
not simply connected. 


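Proof. Let $\psi_{\mathbb{C}} : \mathfrak{sl}(2;\mathbb{C}) \to \mathfrak{gl}(n;\mathbb{C})$ be the complex-linear extension of $\psi$ to
$\mathfrak{sl}(2;\mathbb{C}) \cong \mathfrak{sl}(2;\mathbb{R})_{\mathbb{C}}$, which is a Lie algebra homomorphism (Proposition 3.39).
Since SL(2; C) is simply connected, there exists a Lie group homomorphism
$\Psi_{\mathbb{C}} : \mathrm{SL}(2;\mathbb{C}) \to \mathrm{GL}(n;\mathbb{C})$ such that $\Psi_{\mathbb{C}}(e^{X}) = e^{\psi_{\mathbb{C}}(X)}$ for all $X \in \mathfrak{sl}(2;\mathbb{C})$. If we
let $\Psi$ be the restriction of $\Psi_{\mathbb{C}}$ to SL(2; R), then $\Psi$ is a Lie group homomorphism
which satisfies $\Psi(e^{X}) = e^{\psi(X)}$ for $X \in \mathfrak{sl}(2;\mathbb{R})$. □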


Proof of Proposition 5.16. Since $\phi$ is a Lie algebra isomorphism, the inverse map
$\psi : \mathfrak{sl}(2;\mathbb{R}) \to \mathfrak{g}$ is a Lie algebra homomorphism. Thus, by the lemma, there is a
Lie group homomorphism $\Psi : \mathrm{SL}(2;\mathbb{R}) \to G$ corresponding to $\psi$. Since $\phi$ and $\psi$
are inverses of each other, it follows from Proposition 3.30 and Corollary 3.49 that
$\Phi$ and $\Psi$ are also inverses of each other. □


128 5 The Baker-Campbell—Hausdorff Formula and Its Consequences 
5.9 Subgroups and Subalgebras 


In this section, we address Question 3 from Sect. 5.1: If $G$ is a matrix Lie group with
Lie algebra $\mathfrak{g}$ and $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$, does there exist a matrix Lie group $H \subset G$
whose Lie algebra is $\mathfrak{h}$? If the exponential map for $G$ were a homeomorphism
between $\mathfrak{g}$ and $G$ and if the BCH formula worked globally instead of locally, the
answer would be yes, since we could simply define $H$ to be the set of elements of
the form $e^{X}$, $X \in \mathfrak{h}$, and the BCH formula would show that $H$ is a subgroup.

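In reality, the answer to Question 3, as stated, is no. Suppose, for example, that
$G = \mathrm{GL}(2;\mathbb{C})$ and

\[ \mathfrak{h} = \left\{ \left. \begin{pmatrix} it & 0 \\ 0 & ita \end{pmatrix} \, \right| \, t \in \mathbb{R} \right\}, \tag{5.21} \]

where $a$ is irrational. If there is going to be a matrix Lie group $H$ with Lie algebra
$\mathfrak{h}$, then $H$ would have to contain the closure of the group

\[ H_0 = \left\{ \left. \begin{pmatrix} e^{it} & 0 \\ 0 & e^{ita} \end{pmatrix} \, \right| \, t \in \mathbb{R} \right\}, \tag{5.22} \]

which (Exercise 10 in Chapter 1) is the group

\[ H_1 = \left\{ \left. \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\phi} \end{pmatrix} \, \right| \, \theta, \phi \in \mathbb{R} \right\}. \]

But then the Lie algebra of $H$ would have to contain the Lie algebra of $H_1$, which
is two dimensional!

Fortunately, all is not lost. We can still get a subgroup $H$ for each subalgebra $\mathfrak{h}$
if we weaken the condition that $H$ be a matrix Lie group. In the above example, the
subgroup we want is $H_0$, despite the fact that $H_0$ is not closed.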
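
The failure of the one-parameter "irrational line" subgroup to be closed can be observed numerically. The sketch below is only an illustration: it uses the floating-point value of $\sqrt{2}$ as a stand-in for an irrational $a$, restricts to parameters $t = 2\pi n$ (so that the first diagonal entry equals 1), and searches for an $n$ whose second diagonal entry $e^{ita}$ is close to a prescribed point of the unit circle. The names `h0` and `closest_return` are hypothetical.

```python
import cmath
import math

A_IRRATIONAL = math.sqrt(2.0)  # floating-point stand-in for an irrational number


def h0(t, a=A_IRRATIONAL):
    """Diagonal entries (e^{it}, e^{ita}) of the group element with parameter t."""
    return cmath.exp(1j * t), cmath.exp(1j * t * a)


def closest_return(target_angle, max_turns, a=A_IRRATIONAL):
    """Among t = 2*pi*n with 1 <= n <= max_turns, find the n whose second
    diagonal entry is closest to e^{i * target_angle}."""
    target = cmath.exp(1j * target_angle)
    best_n, best_dist = None, float("inf")
    for n in range(1, max_turns + 1):
        _, second = h0(2.0 * math.pi * n, a)
        dist = abs(second - target)
        if dist < best_dist:
            best_n, best_dist = n, dist
    return best_n, best_dist


n_small, dist_small = closest_return(1.0, 50)
n_large, dist_large = closest_return(1.0, 5000)
print(n_small, dist_small)
print(n_large, dist_large)
```

Allowing more turns brings the second entry closer to the target while the first entry stays at 1, which is consistent with diagonal matrices of the form $\mathrm{diag}(1, e^{i\phi})$ lying in the closure of the one-parameter group even though, for irrational $a$, only $t = 0$ gives an element of that form in the group itself.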


Definition 5.18. If $H$ is any subgroup of GL(n; C), the Lie algebra $\mathfrak{h}$ of $H$ is the
set of all matrices $X$ such that

\[ e^{tX} \in H \]

for all real $t$.


It is possible to prove that for any subgroup $H$ of GL(n; C), the Lie algebra $\mathfrak{h}$ of
$H$ is actually a Lie algebra, that is, a real vector space (possibly zero dimensional)
that is closed under brackets. (See Proposition 1 and Corollary 7 in Chapter 2 of
[Ross].) This result is not, however, directly relevant to our goal in this section,
which is to construct, for each subalgebra $\mathfrak{h}$ of gl(n; C), a subgroup with Lie algebra
$\mathfrak{h}$. Note, however, that if $\mathfrak{h}$ is at least a real subspace of gl(n; C), then the proof of
Point 4 of Theorem 3.20 shows that $\mathfrak{h}$ is also closed under brackets.

Definition 5.19. If $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$, then $H \subset G$ is a
connected Lie subgroup of $G$ if the following conditions are satisfied:


1. $H$ is a subgroup of $G$.

2. The Lie algebra $\mathfrak{h}$ of $H$ is a Lie subalgebra of $\mathfrak{g}$.

3. Every element of $H$ can be written in the form $e^{X_1} e^{X_2} \cdots e^{X_m}$, with
$X_1, \ldots, X_m \in \mathfrak{h}$.


Connected Lie subgroups are also called analytic subgroups. Note that any
group $H$ as in the definition is path connected, since each element of $H$ can be
connected to the identity in $H$ by a path of the form

\[ t \mapsto e^{(1-t)X_1} e^{(1-t)X_2} \cdots e^{(1-t)X_m}. \]

The group $H_0$ in (5.22) is a connected Lie subgroup of GL(2; C) whose Lie algebra
is the algebra $\mathfrak{h}$ in (5.21).

We are now ready to state the main result of this section, which is our second
major application of the Baker-Campbell-Hausdorff formula.


Theorem 5.20. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$ and let $\mathfrak{h}$ be a Lie
subalgebra of $\mathfrak{g}$. Then there exists a unique connected Lie subgroup $H$ of $G$ with
Lie algebra $\mathfrak{h}$.


If $\mathfrak{h}$ is the subalgebra of gl(2; C) in (5.21), then the connected Lie subgroup $H$
is the group $H_0$ in (5.22), which is not closed. In practice, Theorem 5.20 is most
useful in those cases where the connected Lie subgroup $H$ turns out to be closed.
See Proposition 5.24 and Exercises 10, 13, and 14 for conditions under which this
is the case.

We now begin working toward the proof of Theorem 5.20. Since $G$ is assumed
to be a matrix Lie group, we may as well assume that $G = \mathrm{GL}(n;\mathbb{C})$. After all, if $G$
is a closed subgroup of GL(n; C) and $H$ is a connected Lie subgroup of GL(n; C)
whose Lie algebra $\mathfrak{h}$ is contained in $\mathfrak{g}$, then $H$ is also a connected Lie subgroup
of $G$. We now let

\[ H = \{ e^{X_1} e^{X_2} \cdots e^{X_N} \mid X_1, \ldots, X_N \in \mathfrak{h} \}, \tag{5.23} \]

which is a subgroup of $G$. The key issue is to prove that the Lie algebra of $H$,
in the sense of Definition 5.18, is $\mathfrak{h}$. Once we know that $\mathrm{Lie}(H) = \mathfrak{h}$, we will
immediately conclude that $H$ is a connected Lie subgroup with Lie algebra $\mathfrak{h}$, the
remaining properties in Definition 5.19 being true by definition. Note that for the
claim $\mathrm{Lie}(H) = \mathfrak{h}$ to be true, it is essential that $\mathfrak{h}$ be a subalgebra of gl(n; C), and not
merely a subspace; compare Exercise 11.

As in the proof of Theorem 3.42, we think of gl(n; C) as $\mathbb{R}^{2n^2}$ and we decompose
gl(n; C) as the direct sum of $\mathfrak{h}$ and $D$, where $D$ is the orthogonal complement of
$\mathfrak{h}$ with respect to the usual inner product on $\mathbb{R}^{2n^2}$. Then, as shown in the proof
of Theorem 3.42, there exist neighborhoods $U$ and $V$ of the origin in $\mathfrak{h}$ and $D$,
respectively, and a neighborhood $W$ of $I$ in GL(n; C) with the following properties:
Each $A \in W$ can be written uniquely as

\[ A = e^{X} e^{Y}, \qquad X \in U, \ Y \in V, \tag{5.24} \]

in such a way that $X$ and $Y$ depend continuously on $A$. We think of the
decomposition in (5.24) as our local coordinates in a neighborhood of the identity
in GL(n; C).

If $X$ is a small element of $\mathfrak{h}$, the decomposition of $e^{X}$ is just $e^{X} e^{0}$. If we take the
product of two elements of the form $e^{X_1} e^{X_2}$ with $X_1$ and $X_2$ small elements of $\mathfrak{h}$,
then since $\mathfrak{h}$ is a subalgebra, if we combine the exponentials as $e^{X_1} e^{X_2} = e^{X_3}$ by
means of the Baker-Campbell-Hausdorff formula, $X_3$ will again be in $\mathfrak{h}$. Thus, if
we take a small number of products as in (5.23) with the $X_j$'s being small elements
of $\mathfrak{h}$, we will move from the identity in the $X$-direction in the decomposition (5.24).
Globally, however, $H$ may wind around and come back to points in $W$ of the
form (5.24) with $Y \neq 0$. (See Figure 5.3.) Indeed, as the example of the "irrational
line" in (5.22) shows, there may be elements of $H$ in $W$ with arbitrarily small
nonzero values of $Y$. Nevertheless, we will see that the set of $Y$ values that occurs
is at most countable.
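
The statement that combining two exponentials by the Baker-Campbell-Hausdorff formula stays inside a subalgebra can be checked exactly in a case where the series terminates. The sketch below is an illustration under assumptions not made in the text: it uses strictly upper triangular $3 \times 3$ matrices (the Heisenberg Lie algebra), for which all brackets of order three and higher vanish, so that the combined exponent reduces to $X + Y + \tfrac{1}{2}[X, Y]$. The helper names (`heisenberg`, `exp_nilpotent`, `bch`, and so on) are hypothetical.

```python
from fractions import Fraction


def heisenberg(a, b, c):
    """Strictly upper triangular matrix a*E12 + b*E23 + c*E13 with exact entries."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    zero = Fraction(0)
    return [[zero, a, c], [zero, zero, b], [zero, zero, zero]]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(3)] for i in range(3)]


def scale(s, x):
    return [[Fraction(s) * x[i][j] for j in range(3)] for i in range(3)]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    return add(mul(x, y), scale(-1, mul(y, x)))


IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]


def exp_nilpotent(x):
    """Exponential I + x + x^2/2, exact because x^3 = 0 for these matrices."""
    return add(add(IDENTITY, x), scale(Fraction(1, 2), mul(x, x)))


def bch(x, y):
    """Combined exponent x + y + (1/2)[x, y]; higher brackets vanish here."""
    return add(add(x, y), scale(Fraction(1, 2), bracket(x, y)))


x = heisenberg(1, Fraction(2, 3), 0)
y = heisenberg(Fraction(-1, 2), 3, 5)
z = bch(x, y)

lhs = mul(exp_nilpotent(x), exp_nilpotent(y))
rhs = exp_nilpotent(z)
print(lhs == rhs, z)
```

Because the arithmetic is done with `Fraction`, the comparison `lhs == rhs` is exact, and `z` is again strictly upper triangular, that is, it lies in the same Lie algebra as `x` and `y`.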


Lemma 5.21. Decompose gl(n; C) as $\mathfrak{h} \oplus D$ and let $V$ be a neighborhood of the
origin in $D$ as in (5.24). If $E \subset V$ is defined by

\[ E = \{ Y \in V \mid e^{Y} \in H \}, \]

then $E$ is at most countable.


Assuming the lemma, we may now prove Theorem 5.20. 


Fig. 5.3 The black lines indicate the portion of $H$ in the set $W$. The group $H$ intersects $e^{V}$ in at
most countably many points

Proof of Theorem 5.20. As we have already observed, it suffices to show that the
Lie algebra of $H$ is $\mathfrak{h}$. Let $\mathfrak{h}'$ be the Lie algebra of $H$, which clearly contains $\mathfrak{h}$. For
$Z \in \mathfrak{h}'$, we may write, for all sufficiently small $t$,

\[ e^{tZ} = e^{X(t)} e^{Y(t)}, \]

where $X(t) \in U \subset \mathfrak{h}$ and $Y(t) \in V \subset D$ and where $X(t)$ and $Y(t)$ are continuous
functions of $t$. Since $Z$ is in the Lie algebra of $H$, we have $e^{tZ} \in H$ for all $t$. Since,
also, $e^{X(t)}$ is in the group $H$, we conclude that $e^{Y(t)}$ is in $H$ for all sufficiently
small $t$. If $Y(t)$ were not constant, then it would take on uncountably many values,
which would mean that $E$ is uncountable, violating Lemma 5.21. So, $Y(t)$ must be
constant, and since $Y(0) = 0$, this means that $Y(t)$ is identically equal to zero. Thus,
for small $t$, we have $e^{tZ} = e^{X(t)}$ and, therefore, $tZ = X(t) \in \mathfrak{h}$. This means $Z \in \mathfrak{h}$
and we conclude that $\mathfrak{h}' \subset \mathfrak{h}$. □


Before proving Lemma 5.21, we prove another lemma. 


Lemma 5.22. Pick a basis for $\mathfrak{h}$ and call an element of $\mathfrak{h}$ rational if its coefficients
with respect to this basis are rational. Then for every $\delta > 0$ and every $A \in H$, there
exist rational elements $R_1, \ldots, R_m$ of $\mathfrak{h}$ such that

\[ A = e^{R_1} e^{R_2} \cdots e^{R_m} e^{X}, \]

where $X$ is in $\mathfrak{h}$ and $\|X\| < \delta$.


Suppose we take $\delta$ small enough that the ball of radius $\delta$ in $\mathfrak{h}$ is contained in $U$.
Then since there are only countably many $m$-tuples of the form $(R_1, \ldots, R_m)$ with
$R_j$ rational, the lemma tells us that $H$ can be covered by countably many translates
of the set $e^{U}$.


Proof. Choose $\varepsilon > 0$ so that for all $X, Y \in \mathfrak{h}$ with $\|X\| < \varepsilon$ and $\|Y\| < \varepsilon$, the
Baker-Campbell-Hausdorff formula holds for $X$ and $Y$. Let $C(\cdot,\cdot)$ denote the
right-hand side of the formula, so that

\[ e^{X} e^{Y} = e^{C(X,Y)} \]

whenever $\|X\|, \|Y\| < \varepsilon$. It is not hard to see that $C(\cdot,\cdot)$ is continuous. Now, if the
lemma holds for some $\delta$, it also holds for any $\delta' > \delta$. Thus, it is harmless to assume
$\delta$ is less than $\varepsilon$ and small enough that if $\|X\|, \|Y\| < \delta$, we have $\|C(X,Y)\| < \varepsilon$.
Since $e^{X} = (e^{X/k})^{k}$, every element $A$ of $H$ can be written as

\[ A = e^{X_1} \cdots e^{X_N} \tag{5.25} \]

with $X_j \in \mathfrak{h}$ and $\|X_j\| < \delta$. We now proceed by induction on $N$. If $N = 0$, then
$A = I = e^{0}$, and there is nothing to prove. Assume the lemma for $A$'s that can be
expressed as in (5.25) for some integer $N$, and consider $A$ of the form

\[ A = e^{X_1} \cdots e^{X_N} e^{X_{N+1}} \tag{5.26} \]

with $X_j \in \mathfrak{h}$ and $\|X_j\| < \delta$. Applying our induction hypothesis to $e^{X_1} \cdots e^{X_N}$, we
obtain

\[
\begin{aligned}
A &= e^{R_1} \cdots e^{R_m} e^{X} e^{X_{N+1}} \\
  &= e^{R_1} \cdots e^{R_m} e^{C(X, X_{N+1})},
\end{aligned}
\]

where the $R_j$'s are rational and $\|C(X, X_{N+1})\| < \varepsilon$. Since $\mathfrak{h}$ is a subalgebra of
gl(n; C), the element $C(X, X_{N+1})$ is again in $\mathfrak{h}$, but may not have norm less than $\delta$.

Now choose a rational element $R_{m+1}$ of $\mathfrak{h}$ that is very close to $C(X, X_{N+1})$ and
such that $\|R_{m+1}\| < \varepsilon$. We then have

\[
\begin{aligned}
A &= e^{R_1} \cdots e^{R_m} e^{R_{m+1}} e^{-R_{m+1}} e^{C(X, X_{N+1})} \\
  &= e^{R_1} \cdots e^{R_m} e^{R_{m+1}} e^{X'},
\end{aligned}
\]

where

\[ X' = C(-R_{m+1}, C(X, X_{N+1})). \]

Then $X'$ will be in $\mathfrak{h}$, and by choosing $R_{m+1}$ sufficiently close to $C(X, X_{N+1})$, we
can make $\|X'\| < \delta$. After all, since $C(-Z, Z) = \log(e^{-Z} e^{Z}) = 0$ for all small $Z$,
if $Z'$ is close to $Z$, then $C(-Z', Z)$ will be small. □
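
The last step of the proof, in which a rational element close to $Z$ leaves a small remainder $C(-R, Z)$, can be illustrated in coordinates. The sketch below is an assumption-laden toy version: it works in the Heisenberg Lie algebra with an element $aE_{12} + bE_{23} + cE_{13}$ stored as the triple $(a, b, c)$, where the Baker-Campbell-Hausdorff series reduces to $C(X, Y) = X + Y + \tfrac{1}{2}[X, Y]$, and it takes "rational" to mean rounding each coordinate to a multiple of $1/q$. The names `bch`, `rational_approx`, `remainder`, and `norm` are hypothetical.

```python
import math
from fractions import Fraction


def bch(x, y):
    """C(x, y) = x + y + (1/2)[x, y] for triples (a, b, c) <-> a*E12 + b*E23 + c*E13."""
    (a1, b1, c1), (a2, b2, c2) = x, y
    # In these coordinates [x, y] = (a1*b2 - a2*b1) * E13.
    return (a1 + a2, b1 + b2, c1 + c2 + 0.5 * (a1 * b2 - a2 * b1))


def rational_approx(z, q):
    """Round each coordinate of z to the nearest multiple of 1/q."""
    return tuple(Fraction(round(coord * q), q) for coord in z)


def remainder(z, q):
    """X' = C(-R, Z), where R is the rational approximation of Z with denominator q."""
    r = rational_approx(z, q)
    minus_r = tuple(-float(coord) for coord in r)
    return bch(minus_r, z)


def norm(x):
    return math.sqrt(sum(coord * coord for coord in x))


Z = (math.sqrt(2.0), math.sqrt(3.0), math.sqrt(5.0))
coarse = norm(remainder(Z, 10))
fine = norm(remainder(Z, 1000))
print(coarse, fine)
```

Refining the rational approximation (larger `q`) shrinks the remainder, which is the mechanism the proof uses to arrange $\|X'\| < \delta$.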


We now supply the proof of Lemma 5.21. 


Proof of Lemma 5.21. Fix $\delta$ so that for all $X$ and $Y$ with $\|X\|, \|Y\| < \delta$, the
quantity $C(X,Y)$ is defined and contained in $U$. We then claim that for each
sequence $R_1, \ldots, R_m$ of rational elements in $\mathfrak{h}$, there is at most one $X \in \mathfrak{h}$ with
$\|X\| < \delta$ such that the element

\[ e^{R_1} e^{R_2} \cdots e^{R_m} e^{X} \tag{5.27} \]

belongs to $e^{V}$. After all, if we have

\[ e^{R_1} \cdots e^{R_m} e^{X_1} = e^{Y_1}, \tag{5.28} \]
\[ e^{R_1} \cdots e^{R_m} e^{X_2} = e^{Y_2} \tag{5.29} \]

with $Y_1, Y_2 \in V$, then

\[ e^{X_1} e^{-Y_1} = e^{X_2} e^{-Y_2} \]

and so

\[ e^{-Y_1} = e^{-X_1} e^{X_2} e^{-Y_2} = e^{C(-X_1, X_2)} e^{-Y_2}, \]

with $C(-X_1, X_2) \in U$. However, each element of $e^{U} e^{V}$ has a unique representation
as $e^{X} e^{Y}$ with $X \in U$ and $Y \in V$. Thus, we must have $-Y_1 = -Y_2$ and, by (5.28)
and (5.29), $e^{X_1} = e^{X_2}$ and $X_1 = X_2$.

By Lemma 5.22, every element of $H$ can be expressed in the form (5.27) with
$\|X\| < \delta$. Now, there are only countably many rational elements in $\mathfrak{h}$ and thus only
countably many expressions of the form $e^{R_1} \cdots e^{R_m}$, each of which produces at most
one element of the form (5.27) that belongs to $e^{V}$. Thus, the set $E$ in Lemma 5.21
is at most countable. □


This completes the proof of Theorem 5.20. 

If a connected Lie subgroup H of GL(n;C) is not closed, the topology H 
inherits from GL(n; C) may be pathological, e.g., not locally connected. (Compare 
Figure 1.1.) Nevertheless, we can give H a new topology that is much nicer. 


Theorem 5.23. Let $H$ be a connected Lie subgroup of GL(n; C) with Lie algebra $\mathfrak{h}$.
Then $H$ can be given the structure of a smooth manifold in such a way that the group
operations on $H$ are smooth and the inclusion map of $H$ into GL(n; C) is smooth.


Thus, every connected Lie subgroup of GL(n; C) can be made into a Lie group. In
the case of the group $H_0$ in (5.22), the new topology on $H_0$ is obtained by identifying
$H_0$ with $\mathbb{R}$ by means of the parameter $t$ in the definition of $H_0$.


Proof. For any $A \in H$ and any $\varepsilon > 0$, define

\[ U_{A,\varepsilon} = \{ A e^{X} \mid X \in \mathfrak{h} \text{ and } \|X\| < \varepsilon \}. \]

Now define a topology on $H$ as follows: A set $U \subset H$ is open if for each $A \in U$
there exists $\varepsilon > 0$ such that $U_{A,\varepsilon} \subset U$. (See Figure 5.4.) In this topology, two
elements $A$ and $B$ of $H$ are "close" if we can express $B$ as $B = A e^{X}$ with $X \in \mathfrak{h}$
and $\|X\|$ small. This topology is finer than the topology $H$ inherits from $G$; that is,
if $A$ and $B$ are close in this new topology, they are certainly close in the ordinary
sense in $G$, but not vice versa.

It is easy to check that this topology is Hausdorff, and using Lemma 5.22, it is
not hard to see that the topology is second countable. Furthermore, in this topology,
$H$ is locally homeomorphic to $\mathbb{R}^{N}$, where $N = \dim \mathfrak{h}$, by identifying each $U_{A,\varepsilon}$
with the ball of radius $\varepsilon$ in $\mathfrak{h}$.

We may define a smooth structure on $H$ by using the $U_{A,\varepsilon}$'s, with $\varepsilon$ less than some
small number $\varepsilon_0$, as our "atlas." If two of these sets overlap, then some element $C$ of
$H$ can be written as $C = A e^{X} = B e^{Y}$ for some $A, B \in H$ and $X, Y \in \mathfrak{h}$. It follows
that $B = A e^{X} e^{-Y}$, which means (since $\|X\|$ and $\|Y\|$ are less than $\varepsilon_0$) that $A$ and
$B$ are close. The change-of-coordinates map is then $Y = \log(B^{-1} A e^{X})$. Since $A$
and $B$ are close and $\|X\|$ is small, we will have that $\|B^{-1} A e^{X} - I\| < 1$, so that
$B^{-1} A e^{X}$ is in the domain where the matrix logarithm is defined and smooth. Thus,
the change-of-coordinates map is smooth as a function of $X$. Finally, in any of the
coordinate neighborhoods $U_{A,\varepsilon}$, the inclusion of $H$ into $G$ is given by $X \mapsto A e^{X}$,
which is smooth as a function of $X$. □

Fig. 5.4 The set $U$ in $H$ is open in the new topology but not in the topology inherited from
GL(2; C). The element $B$ is close to $A$ in GL(2; C) but not in the new topology on $H$


As we have already noted, Theorem 5.20 is most useful in cases where the 
connected Lie subgroup H is actually closed. The following result gives one 
condition under which this is guaranteed to be the case. See also Exercises 10, 13, 
and 14. 


Proposition 5.24. Suppose $G \subset \mathrm{GL}(n;\mathbb{C})$ is a matrix Lie group with Lie algebra
$\mathfrak{g}$ and that $\mathfrak{h}$ is a maximal commutative subalgebra of $\mathfrak{g}$, meaning that $\mathfrak{h}$ is
commutative and $\mathfrak{h}$ is not contained in any larger commutative subalgebra of $\mathfrak{g}$.
Then the connected Lie subgroup $H$ of $G$ with Lie algebra $\mathfrak{h}$ is closed.


Proof. Since $\mathfrak{h}$ is commutative, $H$ is also commutative, since every element of $H$
is a product of exponentials of elements of $\mathfrak{h}$. It easily follows that the closure $\bar{H}$ of
$H$ in GL(n; C) is also commutative. We now claim that $\bar{H}$ is connected. To see this,
take $A \in \bar{H}$, so that $A$ is in $G$ (since $G$ is closed) and $A$ is the limit of a sequence
$A_m$ in $H$. Since $\bar{H}$ is closed, Theorem 3.42 applies. Thus, for all sufficiently large
$m$, the element $A A_m^{-1}$ is expressible as $A A_m^{-1} = e^{X}$, for some $X$ in the Lie algebra $\mathfrak{h}'$
of $\bar{H}$. Thus, $A = e^{X} A_m$, which means that $A$ can be connected to $A_m$ by the path
$A(t) = e^{(1-t)X} A_m$, $0 \le t \le 1$, in $\bar{H}$. Since $A_m$ can be connected to the identity in
$H \subset \bar{H}$, we see that $A$ can be connected to the identity in $\bar{H}$.

Now, since $\bar{H}$ is commutative, its Lie algebra $\mathfrak{h}'$ is also commutative. But since
$\mathfrak{h}$ was maximal commutative, we must have $\mathfrak{h}' = \mathfrak{h}$. Since, also, $\bar{H}$ is connected, we
conclude that $\bar{H} = H$, showing that $H$ is closed. □

5.10 Lie’s Third Theorem


Lie’s third theorem (in its modern, global form) says that for every
finite-dimensional, real Lie algebra $\mathfrak{g}$, there exists a Lie group $G$ with Lie algebra $\mathfrak{g}$.
We will construct $G$ as a connected Lie subgroup of GL(n; C).


Theorem 5.25. If g is any finite-dimensional, real Lie algebra, there exists a 
connected Lie subgroup G of GL(n; C) whose Lie algebra is isomorphic to g. 


Our proof assumes Ado’s theorem, which asserts that every finite-dimensional 
real or complex Lie algebra is isomorphic to an algebra of matrices. (See, for 
example, Theorem 3.17.7 in [Var].) 


Proof. By Ado’s theorem, we may identify $\mathfrak{g}$ with a real subalgebra of gl(n; C).
Then by Theorem 5.20, there is a connected Lie subgroup of GL(n; C) with Lie
algebra $\mathfrak{g}$. □


It is actually possible to choose the subgroup G in Theorem 5.25 to be closed. 
Indeed, according to Theorem 9 on p. 105 of [Got], if a connected Lie group G 
can be embedded into some GL(n; C) as a connected Lie subgroup, then G can be 
embedded into some other GL(n’; C) as a closed subgroup. Assuming this result, 
we may reach the following conclusion. 


Conclusion 5.26. Every finite-dimensional, real Lie algebra is isomorphic to the 
Lie algebra of some matrix Lie group. 


This result does not, however, mean that every Lie group is isomorphic to a 
matrix Lie group, since there can be several nonisomorphic Lie groups with the 
same Lie algebra. See, for example, Sect. 4.8. 


5.11 Exercises 


1. Let $X$ be a linear transformation on a finite-dimensional real or complex vector
space. Show that

\[ \frac{I - e^{-X}}{X} \]

is invertible if and only if none of the eigenvalues of $X$ (over $\mathbb{C}$) is of the form
$2\pi i n$, with $n$ a nonzero integer.

Remark. This exercise, combined with the formula in Theorem 5.4, gives the
following result (in the language of differentiable manifolds): The exponential
map $\exp : \mathfrak{g} \to G$ is a local diffeomorphism near $X \in \mathfrak{g}$ if and only if $\mathrm{ad}_X :
\mathfrak{g} \to \mathfrak{g}$ has no eigenvalue of the form $2\pi i n$, with $n$ a nonzero integer.

2. Show that for any $X$ and $Y$ in $M_n(\mathbb{C})$, even if $X$ and $Y$ do not commute,

\[ \left. \frac{d}{dt} \operatorname{trace}(e^{X+tY}) \right|_{t=0} = \operatorname{trace}(e^{X} Y). \]

3. Compute $\log(e^{X} e^{Y})$ through third order in $X$ and $Y$ by calculating directly
with the power series for the exponential and the logarithm. Show this gives the
same answer as the Baker-Campbell-Hausdorff formula.

4. Suppose that $X$ and $Y$ are upper triangular matrices with zeros on the diagonal.
Show that the power series for $\log(e^{X} e^{Y})$ is convergent. What happens to the
series form of the Baker-Campbell-Hausdorff formula in this case?

5. Suppose $X$ and $Y$ are $n \times n$ complex matrices satisfying $[X, Y] = \alpha Y$ for some
complex number $\alpha$. Suppose further that there is no nonzero integer $n$ such that
$\alpha = 2\pi i n$. Show that

\[ e^{X} e^{Y} = \exp\left\{ X + \frac{\alpha}{1 - e^{-\alpha}} Y \right\}. \]

Hint: Let $A(t) = e^{X} e^{tY}$ and let

\[ B(t) = \exp\left\{ X + \frac{\alpha}{1 - e^{-\alpha}} tY \right\}. \]

Using Theorem 5.4, show that $A(t)$ and $B(t)$ satisfy the same differential
equation with the same value at $t = 0$.

6. Give an example of matrices $X$ and $Y$ in sl(2; C) such that $[X, Y] = 2\pi i Y$
but such that there does not exist any $Z$ in sl(2; C) with $e^{X} e^{Y} = e^{Z}$. Use
Example 3.41 and compare Exercise 5.

7. Complete Step 4 in the proof of Theorem 5.6 by showing that $\Phi$ is a
homomorphism. For all $A, B \in G$, choose a path $A(t)$ connecting $I$ to $A$ and
a path $B(t)$ connecting $I$ to $B$. Then define a path $C$ connecting $I$ to $AB$ by
setting $C(t) = B(2t)$ for $0 \le t \le 1/2$ and setting $C(t) = A(2t - 1)B$ for
$1/2 \le t \le 1$. If $t_0, \ldots, t_m$ is a good partition for $A(t)$ and $s_0, \ldots, s_M$ is a good
partition for $B(t)$, show that

\[ \frac{s_0}{2}, \ldots, \frac{s_M}{2}, \ \frac{1}{2} + \frac{t_0}{2}, \ldots, \frac{1}{2} + \frac{t_m}{2} \]

is a good partition for $C(t)$. Now, compute $\Phi(A)$, $\Phi(B)$, and $\Phi(AB)$ using these
paths and partitions and show that $\Phi(AB) = \Phi(A)\Phi(B)$.

8. If $\tilde{G}$ is a universal cover of a connected group $G$ with projection map $\Phi$, show
that $\Phi$ maps $\tilde{G}$ onto $G$.

9. Prove the uniqueness of the universal cover, as stated in Proposition 5.13.

10. Let $\mathfrak{a}$ be a subalgebra of the Lie algebra of the Heisenberg group. Show that
$\exp(\mathfrak{a})$ is a connected Lie subgroup of the Heisenberg group and that this
subgroup is closed.


11. Consider the Lie algebra $\mathfrak{h}$ of the Heisenberg group $H$, as computed in
Proposition 3.26. Let $X$, $Y$, and $Z$ be the basis elements for $\mathfrak{h}$ in (4.18), which
satisfy $[X, Y] = Z$ and $[X, Z] = [Y, Z] = 0$. Let $V$ be the subspace of $\mathfrak{h}$
spanned by $X$ and $Y$ (which is not a subalgebra of $\mathfrak{h}$) and let $K$ denote the
subgroup of $H$ consisting of products of exponentials of elements of $V$. Show
that $K = H$ and, thus, that the Lie algebra of $K$ is not equal to $V$.

Hint: Use Theorem 5.1 and the surjectivity of the exponential map for $H$
(Exercise 18 in Chapter 3).

12. Show that every connected Lie subgroup of SU(2) is closed. Show that this is
not the case for SU(3).

13. Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$, let $\mathfrak{h}$ be a subalgebra of $\mathfrak{g}$, and
let $H$ be the unique connected Lie subgroup of $G$ with Lie algebra $\mathfrak{h}$. Suppose
that there exists a simply connected, compact matrix Lie group $K$ such that the
Lie algebra of $K$ is isomorphic to $\mathfrak{h}$. Show that $H$ is closed. Is $H$ necessarily
isomorphic to $K$?

14. This exercise asks you to prove, assuming Ado’s theorem (Sect. 5.10), the
following result: If $G$ is a simply connected matrix Lie group with Lie algebra
$\mathfrak{g}$ and $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, then the connected Lie subgroup $H$ with Lie algebra $\mathfrak{h}$
is closed.

(a) Show that there exists a Lie algebra homomorphism $\phi : \mathfrak{g} \to \mathfrak{gl}(N;\mathbb{C})$
with $\ker(\phi) = \mathfrak{h}$.

Hint: Since $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, the quotient space $\mathfrak{g}/\mathfrak{h}$ has a natural Lie
algebra structure.

(b) Since $G$ is simply connected, there exists a Lie group homomorphism $\Phi :
G \to \mathrm{GL}(N;\mathbb{C})$ for which the associated Lie algebra homomorphism is $\phi$.
Show that the identity component of the kernel of $\Phi$ is a closed subgroup
of $G$ whose Lie algebra is $\mathfrak{h}$.

(c) Show that the result fails if the assumption that $G$ be simply connected is
omitted.

Part II
Semisimple Lie Algebras


Chapter 6 
The Representations of sl(3; C) 


6.1 Preliminaries 


In this chapter, we investigate the representations of the Lie algebra sl(3;C), 
which is the complexification of the Lie algebra of the group SU(3). The main 
result of this chapter is Theorem 6.7, which states that an irreducible finite- 
dimensional representation of sl(3;C) can be classified in terms of its “highest 
weight.” This result is analogous to the results of Sect. 4.6, in which we classify 
the irreducible representations of sl(2;C) by the largest eigenvalue of π(H), namely 
the non-negative integer m. 

The results of this chapter are special cases of the general theory of repre- 
sentations of semisimple Lie algebras (Chapters 7 and 9) and of the theory of 
representations of compact Lie groups (Chapters 11 and 12). It is nevertheless useful 
to consider this case separately, in part because of the importance of SU(3) in 
physical applications but mainly because seeing roots, weights, and the Weyl group 
“in action” in a simple example motivates the introduction of these structures later 
in a more general setting. 

Every finite-dimensional representation of SU(3) (over a complex vector space) 
gives rise to a representation of su(3), which can then be extended by complex 
linearity to sl(3;C) ≅ su(3)_C. Since SU(3) is simply connected, we can go in the 
opposite direction by restricting any representation of sl(3;C) to su(3) and then 
applying Theorem 5.6 to obtain a representation of SU(3). Propositions 4.5 and 4.6 
tell us that a representation of SU(3) is irreducible if and only if the associated rep- 
resentation of sl(3; C) is irreducible, thus establishing a one-to-one correspondence 
between the irreducible representations of SU(3) and the irreducible representations 
of sl(3;C). Furthermore, since SU(3) is compact, Theorem 4.28 then tells us that 
all finite-dimensional representations of SU(3)—and thus, also, of sl(3; C)—are 
completely reducible. 

It is desirable, however, to avoid relying unnecessarily on Theorem 5.6, which 
in turn relies on the Baker–Campbell–Hausdorff formula. If we look at the 
representations from the Lie algebra point of view, we can classify the irreducible 
representations of sl(3;C) without knowing that they come from representations 
of SU(3). Of course, classifying the irreducible representations of sl(3;C) does 
not tell one what a general representation of sl(3;C) looks like, unless one knows 
complete reducibility. Nevertheless, it is possible to give an algebraic proof of 
complete reducibility, without referring to the group SU(3). This proof is given 
in the setting of general semisimple Lie algebras in Sect. 10.3, but it should be fairly 
easy to specialize the argument to the sl(3; C) case. 

Meanwhile, if we look at the representations from the group point of view, 
we can construct the irreducible representations of SU(3) without knowing that 
every representation of sl(3; C) gives rise to a representation of SU(3). Indeed, the 
irreducible representations of SU(3) are constructed as subspaces of tensor products 
of several copies of the standard representation with several copies of the dual of the 
standard representation. Since the standard representation and its dual are defined 
directly at the level of the group SU(3), there is no need to appeal to Theorem 5.6. 

In short, this chapter provides a self-contained classification of the irreducible 
representations of both SU(3) and sl(3;C), without needing to know the results 
of Chapter 5. We establish results for sl(3;C) first, and then pass to SU(3) 
(Theorem 6.8). 


6.2 Weights and Roots 


We will use the following basis for sl(3; C): 


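     [ 1  0  0 ]        [ 0  0  0 ]
H1 = [ 0 −1  0 ],  H2 = [ 0  1  0 ],
     [ 0  0  0 ]        [ 0  0 −1 ]

     [ 0  1  0 ]        [ 0  0  0 ]        [ 0  0  1 ]
X1 = [ 0  0  0 ],  X2 = [ 0  0  1 ],  X3 = [ 0  0  0 ],
     [ 0  0  0 ]        [ 0  0  0 ]        [ 0  0  0 ]

     [ 0  0  0 ]        [ 0  0  0 ]        [ 0  0  0 ]
Y1 = [ 1  0  0 ],  Y2 = [ 0  0  0 ],  Y3 = [ 0  0  0 ].
     [ 0  0  0 ]        [ 0  1  0 ]        [ 1  0  0 ]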


Note that the span ⟨H1, X1, Y1⟩ of H1, X1, and Y1 is a subalgebra of sl(3; C) 
isomorphic to sl(2;C), as can be seen by ignoring the third row and the third 
column in each matrix. The subalgebra ⟨H2, X2, Y2⟩ is, similarly, isomorphic 
to sl(2; C). Thus, we have the following commutation relations: 

[H1, X1] = 2X1,      [H2, X2] = 2X2, 
[H1, Y1] = −2Y1,     [H2, Y2] = −2Y2, 
[X1, Y1] = H1,       [X2, Y2] = H2. 


We now list all of the commutation relations among the basis elements which 
involve at least one of H1 and H2. (This includes some repetitions of the above 
commutation relations.) 


[H1, H2] = 0; 

[H1, X1] = 2X1,      [H1, Y1] = −2Y1, 
[H2, X1] = −X1,      [H2, Y1] = Y1; 

[H1, X2] = −X2,      [H1, Y2] = Y2, 
[H2, X2] = 2X2,      [H2, Y2] = −2Y2;          (6.1) 

[H1, X3] = X3,       [H1, Y3] = −Y3, 
[H2, X3] = X3,       [H2, Y3] = −Y3. 


Finally, we list all of the remaining commutation relations. 


[X1, Y1] = H1, 
[X2, Y2] = H2, 
[X3, Y3] = H1 + H2; 

[X1, X2] = X3,       [Y1, Y2] = −Y3, 
[X1, Y2] = 0,        [X2, Y1] = 0; 

[X1, X3] = 0,        [Y1, Y3] = 0, 
[X2, X3] = 0,        [Y2, Y3] = 0; 

[X2, Y3] = Y1,       [X3, Y2] = X1, 
[X1, Y3] = −Y2,      [X3, Y1] = −X2. 


All of our analysis of the representations of sl(3; C) will be in terms of the above 
basis. From now on, all representations of sl(3;C) will be assumed to be finite 
dimensional and complex linear. 
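
The commutation relations listed in this section can be checked mechanically. The following is a minimal sketch in plain Python, assuming only the eight basis matrices defined at the start of the section; the helper names (`E`, `bracket`, `RELATIONS`) are illustrative and do not come from the text.

```python
# Matrix units and basic 3x3 integer arithmetic (no external libraries).
def E(i, j):
    """3x3 matrix with a 1 in row i, column j (1-indexed) and 0 elsewhere."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)] for r in range(3)]

def add(A, B):
    return [[A[r][c] + B[r][c] for c in range(3)] for r in range(3)]

def sub(A, B):
    return [[A[r][c] - B[r][c] for c in range(3)] for r in range(3)]

def scale(k, A):
    return [[k * A[r][c] for c in range(3)] for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return sub(mul(A, B), mul(B, A))

H1 = sub(E(1, 1), E(2, 2))
H2 = sub(E(2, 2), E(3, 3))
X1, X2, X3 = E(1, 2), E(2, 3), E(1, 3)
Y1, Y2, Y3 = E(2, 1), E(3, 2), E(3, 1)
ZERO = scale(0, H1)

# Each entry is (A, B, expected value of [A, B]).
RELATIONS = [
    (H1, H2, ZERO),
    (H1, X1, scale(2, X1)), (H1, Y1, scale(-2, Y1)),
    (H2, X1, scale(-1, X1)), (H2, Y1, Y1),
    (H1, X2, scale(-1, X2)), (H1, Y2, Y2),
    (H2, X2, scale(2, X2)), (H2, Y2, scale(-2, Y2)),
    (H1, X3, X3), (H1, Y3, scale(-1, Y3)),
    (H2, X3, X3), (H2, Y3, scale(-1, Y3)),
    (X1, Y1, H1), (X2, Y2, H2), (X3, Y3, add(H1, H2)),
    (X1, X2, X3), (Y1, Y2, scale(-1, Y3)),
    (X1, Y2, ZERO), (X2, Y1, ZERO),
    (X1, X3, ZERO), (Y1, Y3, ZERO), (X2, X3, ZERO), (Y2, Y3, ZERO),
    (X2, Y3, Y1), (X3, Y2, X1),
    (X1, Y3, scale(-1, Y2)), (X3, Y1, scale(-1, X2)),
]

all_hold = all(bracket(A, B) == expected for A, B, expected in RELATIONS)
print(all_hold)
```

Because the entries are integers, the comparison is exact; no floating-point tolerance is needed.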

Our basic strategy in classifying the representations of sl(3;C) is to simultaneously 
diagonalize π(H1) and π(H2). (See Sect. A.8 for information on simultaneous 
diagonalization.) Since H1 and H2 commute, π(H1) and π(H2) will also commute 
(for any representation π) and so there is at least a chance that π(H1) and π(H2) 
can be simultaneously diagonalized. (Compare Proposition A.16.) 

Definition 6.1. If (π, V) is a representation of sl(3;C), then an ordered pair 
μ = (m1, m2) ∈ C² is called a weight for π if there exists v ≠ 0 in V such that 


π(H1)v = m1 v, 
π(H2)v = m2 v.          (6.2) 


A nonzero vector v satisfying (6.2) is called a weight vector corresponding to the 
weight μ. If μ = (m1, m2) is a weight, then the space of all vectors v satisfying (6.2) 
is the weight space corresponding to the weight μ. The multiplicity of a weight is 
the dimension of the corresponding weight space. 


Thus, a weight is simply a pair of simultaneous eigenvalues for π(H1) and 
π(H2). It is easily shown that isomorphic representations have the same weights 
and multiplicities. 


Proposition 6.2. Every representation of sl(3; C) has at least one weight. 


Proof. Since we are working over the complex numbers, π(H1) has at least one 
eigenvalue m1 ∈ C. Let W ⊂ V be the eigenspace for π(H1) with eigenvalue m1. 
Since [H1, H2] = 0, π(H2) commutes with π(H1), and, so, by Proposition A.2, 
π(H2) must map W into itself. Then the restriction of π(H2) to W must have at least 
one eigenvector w with eigenvalue m2 ∈ C, which means that w is a simultaneous 
eigenvector for π(H1) and π(H2) with eigenvalues m1 and m2. □ 


Every representation π of sl(3; C) can be viewed, by restriction, as a representation 
of the subalgebras ⟨H1, X1, Y1⟩ and ⟨H2, X2, Y2⟩, both of which are isomorphic 
to sl(2; C). 


Proposition 6.3. If (π, V) is a representation of sl(3; C) and μ = (m1, m2) is a 
weight of V, then both m1 and m2 are integers. 


Proof. Apply Point 1 of Theorem 4.34 to the restriction of π to ⟨H1, X1, Y1⟩ and to 
the restriction of π to ⟨H2, X2, Y2⟩. □ 


Our strategy now is to begin with one simultaneous eigenvector for π(H1) and 
π(H2) and then to apply π(X_j) or π(Y_j) and see what the effect is. The following 
definition is relevant in this context. 


Definition 6.4. An ordered pair α = (a1, a2) ∈ C² is called a root if 


1. a1 and a2 are not both zero, and 
2. there exists a nonzero Z ∈ sl(3; C) such that 


[H1, Z] = a1 Z, 
[H2, Z] = a2 Z. 


The element Z is called a root vector corresponding to the root α. 

Condition 2 in the definition says that Z is a simultaneous eigenvector for ad_{H1} 
and ad_{H2}. This means that Z is a weight vector for the adjoint representation with 
weight (a1, a2). Thus, taking into account Condition 1, we may say that the roots 
are precisely the nonzero weights of the adjoint representation. The commutation 
relations (6.1) tell us that we have the following six roots for sl(3; C): 


α          Z          α          Z 
(2, −1)    X1         (−2, 1)    Y1 
(−1, 2)    X2         (1, −2)    Y2          (6.3) 
(1, 1)     X3         (−1, −1)   Y3 

Note that H1 and H2 are also simultaneous eigenvectors for ad_{H1} and ad_{H2}, but they 
are not root vectors because the simultaneous eigenvalues are both zero. Since the 
vectors in (6.3), together with H1 and H2, form a basis for sl(3; C), it is not hard 
to show that the roots listed in (6.3) are the only roots (Exercise 1). These six roots 
form a “root system,” conventionally called A2. (For much more information about 
root systems, see Chapter 8.) 
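
The table of roots can be reproduced from the diagonal entries of H1 and H2 alone. The sketch below relies on the standard fact that for a diagonal matrix D with entries d_1, d_2, d_3 and a matrix unit E_ij one has [D, E_ij] = (d_i − d_j) E_ij; the function and dictionary names are hypothetical and chosen only for this illustration.

```python
# Diagonal entries of H1 = diag(1, -1, 0) and H2 = diag(0, 1, -1).
H1_DIAG = (1, -1, 0)
H2_DIAG = (0, 1, -1)

# Position (row, column), 1-indexed, of the single nonzero entry of each root vector.
ROOT_VECTORS = {
    "X1": (1, 2), "X2": (2, 3), "X3": (1, 3),
    "Y1": (2, 1), "Y2": (3, 2), "Y3": (3, 1),
}

def root_of_matrix_unit(i, j):
    """Return (a1, a2) with [H1, E_ij] = a1 E_ij and [H2, E_ij] = a2 E_ij."""
    a1 = H1_DIAG[i - 1] - H1_DIAG[j - 1]
    a2 = H2_DIAG[i - 1] - H2_DIAG[j - 1]
    return (a1, a2)

roots = {name: root_of_matrix_unit(i, j) for name, (i, j) in ROOT_VECTORS.items()}

for name, alpha in roots.items():
    print(name, alpha)
```

Each Y_j receives the negative of the root of X_j, since transposing a matrix unit swaps the roles of i and j.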

It is convenient to single out the two roots corresponding to X1 and X2: 


α1 = (2, −1);   α2 = (−1, 2),          (6.4) 


which we call the positive simple roots. They have the property that all of the roots 
can be expressed as linear combinations of α1 and α2 with integer coefficients, and 
these coefficients are (for each root) either all greater than or equal to zero or all less 
than or equal to zero. This is verified by direct computation: 


(2, −1) = α1;   (−1, 2) = α2;   (1, 1) = α1 + α2, 


with the remaining three roots being the negatives of the ones above. The decision 
to designate α1 and α2 as the positive simple roots is arbitrary; any other pair of 
roots with similar properties would do just as well. 

The significance of the roots for the representation theory of sl(3; C) is contained 
in the following lemma, which is the analog of Lemma 4.33 in the sl(2; C) case. 


Lemma 6.5. Let α = (a1, a2) be a root and let Zα ∈ sl(3; C) be a corresponding 
root vector. Let π be a representation of sl(3; C), let μ = (m1, m2) be a weight for 
π, and let v ≠ 0 be a corresponding weight vector. Then we have 


π(H1)π(Zα)v = (m1 + a1)π(Zα)v, 
π(H2)π(Zα)v = (m2 + a2)π(Zα)v. 


Thus, either π(Zα)v = 0 or π(Zα)v is a new weight vector with weight 


μ + α = (m1 + a1, m2 + a2). 

Proof. By the definition of a root, we have the commutation relation [H1, Zα] = 
a1 Zα. Thus, 


π(H1)π(Zα)v = (π(Zα)π(H1) + a1 π(Zα)) v 
            = π(Zα)(m1 v) + a1 π(Zα)v 
            = (m1 + a1)π(Zα)v. 


A similar argument allows us to compute π(H2)π(Zα)v. □ 
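
For completeness, the corresponding computation for H2 can be written out as follows. It is the same calculation as for H1, using the second defining relation of a root from Definition 6.4 and the second eigenvalue equation in (6.2).

```latex
\begin{align*}
\pi(H_2)\pi(Z_\alpha)v
  &= \bigl(\pi(Z_\alpha)\pi(H_2) + \pi([H_2, Z_\alpha])\bigr)v \\
  &= \pi(Z_\alpha)(m_2 v) + a_2\,\pi(Z_\alpha)v \\
  &= (m_2 + a_2)\,\pi(Z_\alpha)v .
\end{align*}
```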


6.3 The Theorem of the Highest Weight 


If we have a representation with a weight μ = (m1, m2), then by applying the root 
vectors X1, X2, X3, Y1, Y2, and Y3, we obtain new weights of the form μ + α, 
where α is the root. Of course, some of the time, π(Zα)v will be zero, in which 
case μ + α is not necessarily a weight. In fact, since our representation is finite 
dimensional, there can be only finitely many weights, so we must get zero quite 
often. By analogy to the classification of the representations of sl(2;C), we would 
like to single out in each representation a “highest” weight and then work from 
there. The following definition gives the “right” notion of highest. 


Definition 6.6. Let α1 = (2, −1) and α2 = (−1, 2) be the roots introduced in (6.4). 
Let μ1 and μ2 be two weights. Then μ1 is higher than μ2 (or, equivalently, μ2 is 
lower than μ1) if μ1 − μ2 can be written in the form 


μ1 − μ2 = aα1 + bα2          (6.5) 


with a ≥ 0 and b ≥ 0. This relationship is written as μ1 ⪰ μ2 or μ2 ⪯ μ1. 
If π is a representation of sl(3; C), then a weight μ0 for π is said to be a highest 
weight if for all weights μ of π, μ ⪯ μ0. 


Note that the relation of “higher” is only a partial ordering; for example, α1 − α2 
is neither higher nor lower than 0. In particular, a finite set of weights need not have 
a highest element. Note also that the coefficients a and b in (6.5) do not have to be 
integers, even if both μ1 and μ2 have integer entries. For example, (1, 0) is higher 
than (0, 0) since (1, 0) = (2/3)α1 + (1/3)α2. 
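
The partial ordering can be tested by solving (6.5) for a and b. The sketch below assumes weights with integer entries and uses exact rational arithmetic; solving 2a − b = m1 and −a + 2b = m2 gives a = (2 m1 + m2)/3 and b = (m1 + 2 m2)/3. The function names are illustrative only.

```python
from fractions import Fraction

def simple_root_coefficients(mu):
    """Solve mu = a*alpha1 + b*alpha2 with alpha1 = (2, -1) and alpha2 = (-1, 2)."""
    m1, m2 = mu
    a = Fraction(2 * m1 + m2, 3)
    b = Fraction(m1 + 2 * m2, 3)
    return a, b

def is_higher(mu1, mu2):
    """True if mu1 - mu2 = a*alpha1 + b*alpha2 with both a and b non-negative."""
    difference = (mu1[0] - mu2[0], mu1[1] - mu2[1])
    a, b = simple_root_coefficients(difference)
    return a >= 0 and b >= 0

# The example from the text: (1, 0) compared with (0, 0).
print(simple_root_coefficients((1, 0)))
print(is_higher((1, 0), (0, 0)))

# alpha1 - alpha2 = (3, -3) is neither higher nor lower than 0.
print(is_higher((3, -3), (0, 0)), is_higher((0, 0), (3, -3)))
```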

We are now ready to state the main theorem regarding the irreducible representations 
of sl(3; C), the theorem of the highest weight. The proof of the theorem is 
found in Sect. 6.4. 


Theorem 6.7. 1. Every irreducible representation π of sl(3;C) is the direct sum 
of its weight spaces. 
2. Every irreducible representation of sl(3; C) has a unique highest weight μ. 
3. Two irreducible representations of sl(3;C) with the same highest weight are 
isomorphic. 
4. The highest weight μ of an irreducible representation must be of the form 


μ = (m1, m2), 


where m1 and m2 are non-negative integers. 
5. For every pair (m1, m2) of non-negative integers, there exists an irreducible 
representation of sl(3; C) with highest weight (m1, m2). 


We will also prove (without appealing to Theorem 5.6) a similar result for the 
group SU(3). Since every irreducible representation of SU(3) gives rise to an 
irreducible representation of sl(3;C) ≅ su(3)_C, the only nontrivial matter is to 
prove Point 5 for SU(3). 


Theorem 6.8. For every pair (m1, m2) of non-negative integers, there exists an 
irreducible representation Π of SU(3) such that the associated representation π 
of sl(3; C) has highest weight (m1, m2). 


One might naturally attempt to construct representations of SU(3) by a method 
similar to that used in Example 4.10, acting on spaces of homogeneous polynomials 
on C³. This is, indeed, possible and the resulting representations of SU(3) turn out 
to be irreducible. Not every irreducible representation of SU(3), however, arises in 
this way, but only those with highest weight of the form (0, m). See Exercise 8. 

For λ = (m1, m2) ∈ C², we may say that λ is an integral element if m1 and 
m2 are integers and that λ is dominant if m1 and m2 are real and non-negative. 
Thus, the set of possible highest weights in Theorem 6.7 are the dominant integral 
elements. Figure 6.1 shows the roots and dominant integral elements for sl(3; C). 


Fig. 6.1 The roots (arrows) and dominant integral elements (black dots), shown 
in the obvious basis 

This picture is made using the obvious basis for the space of weights; that is, the 
x-coordinate is the eigenvalue of H1 and the y-coordinate is the eigenvalue of H2. 
Once we have introduced the Weyl group (Sect. 6.6), we will see the same picture 
rendered using a Weyl-invariant inner product, which will give a more symmetric 
view of the situation. 

Note the parallels between this result and the classification of the irreducible 
representations of sl(2;C): In each irreducible representation of sl(2;C), π(H) 
is diagonalizable, and there is a largest eigenvalue of π(H). Two irreducible 
representations of sl(2;C) with the same largest eigenvalue are isomorphic. The 
highest eigenvalue is always a non-negative integer and every non-negative integer 
is the highest weight of some irreducible representation. 


6.4 Proof of the Theorem 


The proof consists of a series of propositions. 


Proposition 6.9. In every irreducible representation (π, V) of sl(3; C), the operators 
π(H1) and π(H2) can be simultaneously diagonalized; that is, V is the direct 
sum of its weight spaces. 


Proof. Let W be the sum of the weight spaces in V. Equivalently, W is the space of 
all vectors w ∈ V such that w can be written as a linear combination of simultaneous 
eigenvectors for π(H1) and π(H2). Since (Proposition 6.2) π always has at least one 
weight, W ≠ {0}. 

On the other hand, Lemma 6.5 tells us that if Zα is a root vector corresponding 
to the root α, then π(Zα) maps the weight space corresponding to μ into the weight 
space corresponding to μ + α. Thus, W is invariant under the action of each of the 
root vectors, X1, X2, X3, Y1, Y2, and Y3. Since W is certainly also invariant under the 
action of H1 and H2, W is invariant under all of sl(3;C). Thus, by irreducibility, 
W = V. Finally, since, by Proposition A.17, weight vectors with distinct weights 
are independent, V is actually the direct sum of its weight spaces. □ 


Definition 6.10. A representation (π, V) of sl(3; C) is said to be a highest weight 
cyclic representation with weight μ = (m1, m2) if there exists v ≠ 0 in V such 
that 


1. v is a weight vector with weight μ, 
2. π(X_j)v = 0, for j = 1, 2, 3, 
3. the smallest invariant subspace of V containing v is all of V. 


Proposition 6.11. Let (π, V) be a highest weight cyclic representation of sl(3; C) 
with weight μ. Then the following results hold. 


1. The representation π has highest weight μ. 
2. The weight space corresponding to the weight μ is one dimensional. 

Before turning to the proof of this proposition, let us record a simple lemma 
that applies to arbitrary Lie algebras and which will be useful also in the setting of 
general semisimple Lie algebras. 


Lemma 6.12 (Reordering Lemma). Suppose that g is any Lie algebra and that 
π is a representation of g. Suppose that X_1, ..., X_m is an ordered basis for g as a 
vector space. Then any expression of the form 


π(X_{j_1}) π(X_{j_2}) ··· π(X_{j_N}),          (6.6) 

can be expressed as a linear combination of terms of the form 

π(X_m)^{k_m} π(X_{m−1})^{k_{m−1}} ··· π(X_1)^{k_1}          (6.7) 


where each k_l is a non-negative integer and where k_1 + k_2 + ··· + k_m ≤ N. 


Proof. The idea is to use the commutation relations of g to reorder the factors 
into the desired order, at the expense of generating terms with one fewer factor, 
which can then be handled by the same method. To be more formal, we use induction 
on N. If N = 1, there is nothing to do: Any expression of the form π(X_j) is of 
the form (6.7) with k_j = 1 and all the other k_l’s equal to zero. Assume, then, that 
the result holds for a product of at most N factors, and consider an expression of the 
form (6.6) with N + 1 factors. By induction, we can assume that the last N factors 
are in the desired form, giving an expression of the form 


π(X_j) π(X_m)^{k_m} π(X_{m−1})^{k_{m−1}} ··· π(X_1)^{k_1} 


with k_1 + ··· + k_m ≤ N. 

We now move the factor of π(X_j) to the right one step at a time until it is in the 
right spot. Each time we have π(X_j)π(X_k) somewhere in the expression we use the 
relation 


π(X_j)π(X_k) = π(X_k)π(X_j) + π([X_j, X_k]) 
             = π(X_k)π(X_j) + Σ_l c_{jkl} π(X_l), 


where the constants c_{jkl} are the structure constants for the basis {X_j} (Definition 
3.10). Each commutator term has at most N factors. Thus, we ultimately 
obtain several terms with at most N factors, which can be handled by induction, and 
one term with N + 1 factors that is of the desired form (once π(X_j) finally gets to 
the right spot). □ 
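
As an illustration of the reordering procedure, here is a small worked example in sl(3; C). It assumes, for the sake of the example, that the basis is ordered as X1, X2, X3, H1, H2, Y1, Y2, Y3, so that in the form (6.7) the factors π(Y_j) stand leftmost and the factors π(X_j) rightmost. Only the relations [X1, Y1] = H1 and [H1, Y1] = −2Y1 are used.

```latex
\begin{align*}
\pi(X_1)\pi(Y_1)
  &= \pi(Y_1)\pi(X_1) + \pi([X_1, Y_1])
   = \pi(Y_1)\pi(X_1) + \pi(H_1), \\[4pt]
\pi(X_1)\pi(Y_1)^2
  &= \pi(Y_1)\pi(X_1)\pi(Y_1) + \pi(H_1)\pi(Y_1) \\
  &= \pi(Y_1)^2\pi(X_1) + \pi(Y_1)\pi(H_1) + \pi(H_1)\pi(Y_1) \\
  &= \pi(Y_1)^2\pi(X_1) + 2\,\pi(Y_1)\pi(H_1) - 2\,\pi(Y_1).
\end{align*}
```

In the last step, π(H1)π(Y1) was itself reordered as π(Y1)π(H1) + π([H1, Y1]) = π(Y1)π(H1) − 2π(Y1). Every term on the right-hand side is of the form (6.7), with at most three factors.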


We now proceed with the proof of Proposition 6.11. 




Proof. Let v be as in the definition. Consider the subspace W of V spanned by 
elements of the form 


w = π(Y_{j_1}) π(Y_{j_2}) ··· π(Y_{j_N}) v          (6.8) 


with each j_l equal to 1, 2, or 3 and N ≥ 0. (If N = 0, then w = v.) We now claim 
that W is invariant. We take as our basis for sl(3; C) the elements X1, X2, X3, H1, 
H2, Y1, Y2, and Y3, in that order. If we apply a basis element to w, the lemma tells 
us that we can rewrite the resulting vector as a linear combination of terms in which 
the π(X_j)’s act first, the π(H_j)’s act second, and the π(Y_j)’s act last, and all of 
these are applied to the vector v. Since v is annihilated by each π(X_j), any term 
having a positive power of any π(X_j) is simply zero. Since v is an eigenvector for each 
π(H_j), any factors of π(H_j) acting on v can be replaced by constants. That leaves 
only factors of π(Y_j) applied to v, which means that we have a linear combination 
of vectors of the form (6.8). Thus, W is invariant and contains v, so W = V. 

Now, Y1, Y2, and Y3 are root vectors with roots −α1, −α2, and −α1 − α2, 
respectively. Thus, by Lemma 6.5, each nonzero element of the form (6.8) with N > 0 
is a weight vector with weight lower than μ. Since these elements, together with v, 
span V, every weight of π is lower than or equal to μ, and the only weight vectors 
with weight μ are multiples of v. □ 


Proposition 6.13. Every irreducible representation of sl(3;C) is a highest weight 
cyclic representation, with a unique highest weight μ. 


Proof. We have already shown that every irreducible representation π is the direct 
sum of its weight spaces. Since the representation is finite dimensional, there can be 
only finitely many weights, so there must be a maximal weight μ, that is, such that 
there is no weight strictly higher than μ. Thus, for any nonzero weight vector v with 
weight μ, we must have 


π(X_j)v = 0,   j = 1, 2, 3. 


Since π is irreducible, the smallest invariant subspace containing v must be the 
whole space; therefore, the representation is highest weight cyclic. □ 


Proposition 6.14. Suppose (π, V) is a completely reducible representation of 
sl(3; C) that is also highest weight cyclic. Then π is irreducible. 


As it turns out, every finite-dimensional representation of sl(3;C) is completely 
reducible. This claim can be verified analytically (by passing to the simply con- 
nected group SU(3) and using Theorem 4.28) or algebraically (as in Sect. 10.3). We 
do not, however, require this result here, since we will only apply Proposition 6.14 
to representations that are manifestly completely reducible. 

Meanwhile, it is tempting to think that any representation with a cyclic vector 
(that is, a vector satisfying Point 3 of Definition 6.10) must be irreducible, but this 
is false. (What is true is that if every nonzero vector in a representation is cyclic, 
then the representation is irreducible.) Thus, Proposition 6.14 relies on the special 
form of the cyclic vector in Definition 6.10. 

Proof. Let (π, V) be a highest weight cyclic representation with highest weight μ 
and let v be a weight vector with weight μ. By assumption, V decomposes as a 
direct sum of irreducible representations 


V ≅ ⊕_j V_j.          (6.9) 


By Proposition 6.9, each of the V_j’s is the direct sum of its weight spaces. Since 
the weight μ occurs in V, it must occur in some V_j (compare the last part of 
Proposition A.17). But by Proposition 6.11, v is (up to a constant) the only vector in 
V with weight μ. Thus, V_j is an invariant subspace containing v, which means that 
V_j = V. There is, therefore, only one term in the sum (6.9), and V is irreducible. □ 


Proposition 6.15. Two irreducible representations of sl(3; C) with the same highest 
weight are isomorphic. 


Proof. Suppose (π, V) and (σ, W) are irreducible representations with the same 
highest weight μ and let v and w be the highest weight vectors for V and W, 
respectively. Consider the representation V ⊕ W and let U be the smallest invariant 
subspace of V ⊕ W which contains the vector (v, w). Then U is a highest weight 
cyclic representation. Furthermore, since V ⊕ W is, by definition, completely 
reducible, it follows from Proposition 4.26 that U is completely reducible. Thus, 
by Proposition 6.14, U is irreducible. 

Consider now the two “projection” maps P1 and P2, mapping V ⊕ W to V and 
W, respectively, and given by 


P1(v, w) = v;   P2(v, w) = w. 


Since P1 and P2 are easily seen to be intertwining maps, their restrictions to 
U ⊂ V ⊕ W are also intertwining maps. Now, neither P1|_U nor P2|_U is the zero 
map, since both are nonzero on (v, w). Moreover, U, V, and W are all irreducible. 
Therefore, by Schur’s lemma, P1|_U is an isomorphism of U with V and P2|_U is an 
isomorphism of U with W, showing that V ≅ U ≅ W. □ 


Proposition 6.16. If π is an irreducible representation of sl(3;C) with highest 
weight μ = (m1, m2), then m1 and m2 are non-negative integers. 


Proof. By Proposition 6.3, m1 and m2 are integers. If v is a weight vector 
with weight μ, then π(X1)v and π(X2)v must be zero, or μ would not be the 
highest weight for π. Thus, if we then apply Point 1 of Theorem 4.34 to the 
restrictions of π to ⟨H1, X1, Y1⟩ and ⟨H2, X2, Y2⟩, we conclude that m1 and m2 
are non-negative. □ 


Proposition 6.17. If m1 and m2 are non-negative integers, then there exists an 
irreducible representation of sl(3; C) with highest weight μ = (m1, m2). 

Proof. Since the trivial representation is an irreducible representation with highest 
weight (0, 0), we need only construct representations with at least one of m1 and m2 
positive. 

First, we construct two irreducible representations, with highest weights (1, 0) 
and (0, 1), which we call the fundamental representations. The standard representation 
of sl(3; C), acting on C³ in the obvious way, is easily seen to be irreducible. It 
has weight vectors e1, e2, and e3, with corresponding weights (1, 0), (−1, 1), and 
(0, −1), and with highest weight (1, 0). The dual of the standard representation, 
given by 


π(Z) = −Z^tr          (6.10) 


for all Z ∈ sl(3; C), is also irreducible. It also has weight vectors e1, e2, and e3, with 
corresponding weights (−1, 0), (1, −1), and (0, 1) and with highest weight (0, 1). 
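
The weights of the two fundamental representations can be read off from the diagonal entries of H1 and H2, since both matrices are diagonal and −Z^tr = −Z for a diagonal Z. The following sketch lists the weights and picks out the highest one using the ordering of Definition 6.6 (with integer weights, the conditions a ≥ 0 and b ≥ 0 reduce to 2d1 + d2 ≥ 0 and d1 + 2d2 ≥ 0 for the difference (d1, d2)). All function names are hypothetical.

```python
H1_DIAG = (1, -1, 0)   # diagonal entries of H1
H2_DIAG = (0, 1, -1)   # diagonal entries of H2

def standard_weights():
    """Weights of e1, e2, e3 in the standard representation pi(Z) = Z."""
    return [(H1_DIAG[j], H2_DIAG[j]) for j in range(3)]

def dual_weights():
    """Weights of e1, e2, e3 in the dual representation pi(Z) = -Z^tr."""
    return [(-H1_DIAG[j], -H2_DIAG[j]) for j in range(3)]

def is_higher(mu, nu):
    """True if mu - nu = a*alpha1 + b*alpha2 with a >= 0 and b >= 0."""
    d1, d2 = mu[0] - nu[0], mu[1] - nu[1]
    return 2 * d1 + d2 >= 0 and d1 + 2 * d2 >= 0

def highest_weight(weights):
    """Return a weight that is higher than every weight in the list, if one exists."""
    candidates = [mu for mu in weights if all(is_higher(mu, nu) for nu in weights)]
    return candidates[0] if candidates else None

print(standard_weights(), highest_weight(standard_weights()))
print(dual_weights(), highest_weight(dual_weights()))
```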

Let (π1, V1) and (π2, V2) be the standard representation and its dual, respectively, 
and let v1 = e1 and v2 = e3 be the respective highest weight vectors. Now, consider 
the representation π_{m1,m2} given by 


(V1 ⊗ ··· ⊗ V1) ⊗ (V2 ⊗ ··· ⊗ V2),          (6.11) 


where V1 occurs m1 times and V2 occurs m2 times. The action of sl(3;C) on this 
space is given by the obvious extension of Definition 4.20 to multiple factors. It is 
then easy to check that the vector 


v_{m1,m2} = v1 ⊗ v1 ⊗ ··· ⊗ v1 ⊗ v2 ⊗ v2 ⊗ ··· ⊗ v2 


is a weight vector with weight (m1, m2) and that v_{m1,m2} is annihilated by 
π_{m1,m2}(X_j), j = 1, 2, 3. 

Now let W be the smallest invariant subspace containing v_{m1,m2}. Assuming 
that π_{m1,m2} is completely reducible, W will also be completely reducible and 
Proposition 6.14 will tell us that W is the desired irreducible representation with 
highest weight (m1, m2). 

It remains only to establish complete reducibility. Note first that both the standard 
representation and its dual are “unitary” for the action of su(3), meaning that 
π(X)* = −π(X) for all X ∈ su(3). Meanwhile, it is easy to verify (Exercise 5) 
that if V and W are inner product spaces, then there is a unique inner product on 
V ⊗ W for which 


⟨v1 ⊗ w1, v2 ⊗ w2⟩ = ⟨v1, v2⟩ ⟨w1, w2⟩ 


for all v1, v2 ∈ V and w1, w2 ∈ W. Extending this construction to tensor products 
of several vector spaces, use the standard inner product on C³ to construct an inner 
product on the space in (6.11). It is then easy to check that π_{m1,m2} is also unitary for 
the action of su(3). Thus, by Proposition 4.27, π_{m1,m2} is completely reducible under 
the action of su(3) and thus, also, under the action of sl(3;C) ≅ su(3)_C. □ 

We have now completed the proof of Theorem 6.7. 


Proof of Theorem 6.8. The standard representation π1 of sl(3;C) comes from the 
standard representation Π1 of SU(3), and similarly for the dual of the standard 
representation. By taking tensor products, we see that there is a representation 
Π_{m1,m2} of SU(3) corresponding to the representation π_{m1,m2} of sl(3;C). The irreducible 
invariant subspace W in the proof of Proposition 6.17 is then also invariant under the 
action of SU(3), so that the restriction of Π_{m1,m2} to W is the desired representation 
of SU(3). □ 


6.5 An Example: Highest Weight (1, 1) 


To obtain the irreducible representation with highest weight (1,1), we take the 
tensor product of the standard representation and its dual, take the highest weight 
vector in the tensor product, and then consider the space obtained by repeated 
applications of the operators π_{1,1}(Y_j), j = 1, 2, 3. Since, however, Y3 = −[Y1, Y2], 
it suffices to apply only π_{1,1}(Y1) and π_{1,1}(Y2). 

Now, the standard representation has highest weight vector e1 and the action of the 
operators π1(Y1) = Y1 and π1(Y2) = Y2 is given by 


Y1 e1 = e2     Y1 e2 = 0      Y1 e3 = 0 
Y2 e1 = 0      Y2 e2 = e3     Y2 e3 = 0. 


For the dual of the standard representation, let us use the notation Z̃ = −Z^tr, so that 
π2(Z) = Z̃. If we introduce the new basis 


f1 = e3,   f2 = −e2,   f3 = e1, 


then the highest weight vector is f1 and we have 


Ỹ1 f1 = 0      Ỹ1 f2 = f3     Ỹ1 f3 = 0 
Ỹ2 f1 = f2     Ỹ2 f2 = 0      Ỹ2 f3 = 0. 


We must now repeatedly apply the operators 


π_{1,1}(Y1) = Y1 ⊗ I + I ⊗ Ỹ1 
π_{1,1}(Y2) = Y2 ⊗ I + I ⊗ Ỹ2          (6.12) 


until we get zero. This calculation is contained in the following chart. Here, there are 
two arrows coming out of each vector. Of these, the left arrow indicates the action 
of Y1 ⊗ I + I ⊗ Ỹ1 and the right arrow indicates the action of Y2 ⊗ I + I ⊗ Ỹ2. To 
save space, we omit the tensor product symbol, writing, for example, e2 f2 instead 
of e2 ⊗ f2. 

                                  e1 f1
                               ↙         ↘
                     e2 f1                     e1 f2
                   ↙       ↘                 ↙       ↘
                0    e3 f1 + e2 f2     e2 f2 + e1 f3    0
                       ↙      ↘           ↙      ↘
                  e2 f3     2 e3 f2   2 e2 f3     e3 f2
                  ↙  ↘       ↙  ↘      ↙  ↘       ↙  ↘
                 0  e3 f3 2e3 f3  0   0  2e3 f3 e3 f3  0


A basis for the space spanned by these vectors is e1 f1, e2 f1, e1 f2, e3 f1 + e2 f2, 
e2 f2 + e1 f3, e2 f3, e3 f2, and e3 f3. Thus, the dimension of this representation is 
8; it is (isomorphic to) the adjoint representation. Now, e1, e2, and e3 have weights 
(1, 0), (−1, 1), and (0, −1), respectively, whereas f1, f2, and f3 have weights (0, 1), 
(1, −1), and (−1, 0), respectively. From (6.12), we can see that the weight for e_j ⊗ f_k 
is just the sum of the weight for e_j and the weight for f_k. Thus, the weights for 
the basis elements listed above are (1, 1), (−1, 2), (2, −1), (0, 0) (twice), (−2, 1), 
(1, −2), and (−1, −1). Each weight has multiplicity 1 except for (0, 0), which has 
multiplicity 2. See the first image in Figure 6.4. 
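
The dimension count above can be confirmed by a direct computation. In the sketch below, a vector in the tensor product is stored as a 3×3 coefficient matrix C, where C[j][k] is the coefficient of e_j ⊗ e_k (second factor written in the standard basis, so that e1 ⊗ f1 = e1 ⊗ e3). With this convention, Z ⊗ I + I ⊗ Z̃ acts as C ↦ Z C + C (Z̃)^tr. The storage convention and all helper names are assumptions made for this illustration, not notation from the text.

```python
from fractions import Fraction

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def add(A, B):
    return [[A[r][c] + B[r][c] for c in range(3)] for r in range(3)]

def transpose(A):
    return [[A[c][r] for c in range(3)] for r in range(3)]

def dual(Z):
    """The matrix -Z^tr by which Z acts in the dual representation."""
    return [[-Z[c][r] for c in range(3)] for r in range(3)]

def act(Z, C):
    """Apply Z (x) I + I (x) dual(Z) to the vector with coefficient matrix C."""
    return add(mul(Z, C), mul(C, transpose(dual(Z))))

def is_zero(C):
    return all(x == 0 for row in C for x in row)

def rank(matrices):
    """Rank of the span of a list of 3x3 matrices, via exact Gaussian elimination."""
    rows = [[Fraction(x) for row in M for x in row] for M in matrices]
    r = 0
    for col in range(9):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

H1 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
H2 = [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
Y1 = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
Y2 = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
START = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]   # e1 (x) e3, that is, e1 f1

def generated_vectors(start, lowering_operators):
    """All nonzero vectors reached from start by repeatedly applying the operators."""
    found, frontier = [], [start]
    while frontier:
        C = frontier.pop()
        if is_zero(C):
            continue
        found.append(C)
        frontier.extend(act(Y, C) for Y in lowering_operators)
    return found

vectors = generated_vectors(START, [Y1, Y2])
dimension = rank(vectors)
zero_weight = [C for C in vectors if is_zero(act(H1, C)) and is_zero(act(H2, C))]
zero_multiplicity = rank(zero_weight)
print(dimension, zero_multiplicity)
```

The search terminates because each application of Y1 or Y2 lowers the weight, and only finitely many weights occur.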


6.6 The Weyl Group 


This section describes an important symmetry of the representations of SU(3), 
involving something called the Weyl group. Our discussion follows the compact- 
group approach to the Weyl group. See Sect. 7.4 for the Lie algebra approach, in the 
context of general semisimple Lie algebras. 


Definition 6.18. Let h be the two-dimensional subspace of sl(3; C) spanned by H1 
and H2. Let N be the subgroup of SU(3) consisting of those A ∈ SU(3) such 
that Ad_A(H) is an element of h for all H in h. Let Z be the subgroup of SU(3) 
consisting of those A ∈ SU(3) such that Ad_A(H) = H for all H ∈ h. 


The space h is a Cartan subalgebra of sl(3; C). It is a straightforward exercise 
(Exercise 9) to verify that Z and N are subgroups of SU(3) and that Z is a normal 
subgroup of N . This leads us to the definition of the Weyl group. 


Definition 6.19. The Weyl group of SU(3), denoted W, is the quotient 
group N/Z. 


The primary significance of W for the representation theory of SU(3) is that 
it gives rise to a symmetry of the weights occurring in a fixed representation; see 
Theorem 6.22. We can define an action of W on h as follows. For each element w 
of W, choose an element A of the corresponding coset in N. Then for H in h we 
define the action w · H of w on H by 


w · H = Ad_A(H). 

To see that this action is well defined, suppose B is an element of the same coset 
as A. Then B = AC with C ∈ Z and, thus, 


Ad_B(H) = Ad_A(Ad_C(H)) = Ad_A(H), 


by the definition of Z. Note that by definition, if w · H = H for all H ∈ h, then w is 
the identity element of W (that is, the associated A ∈ N is actually in Z). Thus, we 
may identify W with the group of linear transformations of h that can be expressed 
in the form H ↦ w · H for some w ∈ W. 


Proposition 6.20. The group Z consists precisely of the diagonal matrices inside 
SU(3), namely the diagonal matrices with diagonal entries (e^{iθ}, e^{iφ}, e^{−i(θ+φ)}) with 
θ, φ ∈ R. The group N consists of precisely those matrices A ∈ SU(3) such that 
for each j = 1, 2, 3, there exist k_j ∈ {1, 2, 3} and θ_j ∈ R such that A e_j = e^{iθ_j} e_{k_j}. 
Here, e1, e2, e3 is the standard basis for C³. 

The Weyl group W = N/Z is isomorphic to the permutation group on three 
elements. 


Proof. Suppose A is in Z, which means that A commutes with all elements of $\mathfrak{h}$,
including $H_1$, which has eigenvectors $e_1$, $e_2$, and $e_3$, with corresponding eigenvalues
$1$, $-1$, and $0$. Since A commutes with $H_1$, it must preserve each of these eigenspaces
(Proposition A.2). Thus, $Ae_j$ must be a multiple of $e_j$ for each j, meaning that A is
diagonal. Conversely, any diagonal matrix in SU(3) does indeed commute not only
with $H_1$ but also with $H_2$ and, thus, with every element of $\mathfrak{h}$.

Suppose, now, that A is in N. Then $AH_1A^{-1}$ must be in $\mathfrak{h}$ and therefore must
be diagonal, meaning that $e_1$, $e_2$, and $e_3$ are eigenvectors for $AH_1A^{-1}$, with the
same eigenvalues $1, -1, 0$ as $H_1$, but not necessarily in the same order. On the other
hand, the eigenvectors of $AH_1A^{-1}$ must be $Ae_1$, $Ae_2$, and $Ae_3$. Thus, $Ae_j$ must be
a multiple of some $e_{k_j}$, and the constant must have absolute value 1 if A is unitary.
Conversely, if $Ae_j$ is a multiple of $e_{k_j}$ for each j, then for any (diagonal) matrix H
in $\mathfrak{h}$, the matrix $AHA^{-1}$ will again be diagonal and thus in $\mathfrak{h}$.

Finally, if A maps each $e_j$ to a multiple of $e_{k_j}$, for some $k_j$ depending on j,
then for each diagonal matrix H, the matrix $AHA^{-1}$ will be diagonal with diagonal
entries rearranged by the permutation $j \mapsto k_j$. For any permutation, we can choose
the constants $\theta_j$ so that the map taking $e_j$ to $e^{i\theta_j} e_{k_j}$ has determinant 1, showing that
every permutation actually arises in this way. Thus, W (which we think of as the
group of linear transformations of $\mathfrak{h}$ of the form $\mathrm{Ad}_A$, $A \in N$) is isomorphic to the
permutation group on three elements. □
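
The last step of the proof can be checked by direct computation. The following is a minimal sketch, not part of the original argument: it assumes one particular choice of constants (entries in {1, -1}, so that each representative is a real signed permutation matrix), and the helper names `representative`, `conjugate`, and `weyl_images` are hypothetical.

```python
from itertools import permutations

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def matmul(x, y):
    """Product of two 3x3 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def representative(perm):
    """Signed permutation matrix A with A e_j = c_j e_{perm[j]} and det A = 1.

    The constants c_j are taken in {1, -1}; this is one convenient choice,
    not the only one.
    """
    a = [[0] * 3 for _ in range(3)]
    for j, k in enumerate(perm):
        a[k][j] = 1
    if det3(a) == -1:
        for row in a:  # flip the sign of the first column to get det = +1
            row[0] = -row[0]
    return a

def conjugate(a, diag):
    """Return A H A^{-1} for H = diag(*diag).

    A is real orthogonal here, so A^{-1} is the transpose of A.
    """
    h = [[diag[i] if i == j else 0 for j in range(3)] for i in range(3)]
    return matmul(matmul(a, h), transpose(a))

def weyl_images(diag):
    """Map each permutation to the diagonal of A H A^{-1}."""
    images = {}
    for perm in permutations(range(3)):
        m = conjugate(representative(perm), diag)
        images[perm] = tuple(m[i][i] for i in range(3))
    return images
```

For a diagonal matrix with three distinct entries, `weyl_images` returns six distinct rearrangements, one for each permutation, in line with the identification of W with the permutation group on three elements.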


We want to show that the Weyl group is a symmetry of the weights of any finite-
dimensional representation of sl(3; C). To understand this, we need to adopt a less
basis-dependent view of the weights. We have defined a weight as a pair $(m_1, m_2)$
of simultaneous eigenvalues for $\pi(H_1)$ and $\pi(H_2)$. However, if a vector v is an
eigenvector for $\pi(H_1)$ and $\pi(H_2)$ then it is also an eigenvector for $\pi(H)$ for any
element H of the space $\mathfrak{h}$ spanned by $H_1$ and $H_2$, and the eigenvalues will depend
linearly on H in $\mathfrak{h}$. Thus, we may think of a weight not as a pair of numbers but as
a linear functional on $\mathfrak{h}$.

It is then convenient to use an inner product on $\mathfrak{h}$ to identify linear functionals on
$\mathfrak{h}$ with elements of $\mathfrak{h}$ itself. We define the inner product of H and H' in $\mathfrak{h}$ by

\[ \langle H, H' \rangle = \operatorname{trace}(H^* H'), \tag{6.13} \]

or, explicitly,

\[ \langle \operatorname{diag}(a, b, c), \operatorname{diag}(d, e, f) \rangle = \bar{a}d + \bar{b}e + \bar{c}f, \]

where $\operatorname{diag}(\cdot, \cdot, \cdot)$ is the diagonal matrix with the indicated diagonal entries. If $\phi$ is
a linear functional on $\mathfrak{h}$, there is (Proposition A.11) a unique vector $\lambda$ in $\mathfrak{h}$ such
that $\phi$ may be represented as $\phi(H) = \langle \lambda, H \rangle$ for all $H \in \mathfrak{h}$. If we represent the
linear functional in the previous paragraph in this way, we arrive at a new, basis-
independent notion of a weight.


Definition 6.21. Let $\mathfrak{h}$ be the subspace of sl(3; C) spanned by $H_1$ and $H_2$ and let
$(\pi, V)$ be a representation of sl(3; C). An element $\lambda$ of $\mathfrak{h}$ is called a weight for $\pi$ if
there exists a nonzero vector v in V such that

\[ \pi(H)v = \langle \lambda, H \rangle v \]

for all H in $\mathfrak{h}$. Such a vector v is called a weight vector with weight $\lambda$.

If $\lambda$ is a weight in our new sense, the ordered pair $(m_1, m_2)$ in Definition 6.1 is
given by

\[ m_1 = \langle \lambda, H_1 \rangle; \quad m_2 = \langle \lambda, H_2 \rangle. \]

It is easy to check that for all $U \in N$, the adjoint action of U on $\mathfrak{h}$ preserves
the inner product in (6.13). Thus, the action of the Weyl group on $\mathfrak{h}$ is unitary:
$\langle w \cdot H, w \cdot H' \rangle = \langle H, H' \rangle$. Since the roots are just the nonzero weights of the
adjoint representation, we now also think of the roots as elements of $\mathfrak{h}$.


Theorem 6.22. Suppose that $(\Pi, V)$ is a finite-dimensional representation of
SU(3) with associated representation $(\pi, V)$ of sl(3; C). If $\lambda \in \mathfrak{h}$ is a weight for
V then $w \cdot \lambda$ is also a weight of V with the same multiplicity. In particular, the roots
are invariant under the action of the Weyl group.


Proof. Suppose that $\lambda$ is a weight for V with weight vector v. Then for all $U \in N$
and $H \in \mathfrak{h}$, we have

\begin{align*}
\pi(H)\Pi(U)v &= \Pi(U)\bigl(\Pi(U)^{-1}\pi(H)\Pi(U)\bigr)v \\
&= \Pi(U)\pi(U^{-1}HU)v \\
&= \langle \lambda, U^{-1}HU \rangle \Pi(U)v.
\end{align*}

Here, we have used that U is in N, which guarantees that $U^{-1}HU$ is, again, in $\mathfrak{h}$.
Thus, if w is the Weyl group element represented by U, we have

\[ \pi(H)\Pi(U)v = \langle \lambda, w^{-1} \cdot H \rangle \Pi(U)v = \langle w \cdot \lambda, H \rangle \Pi(U)v. \]

We conclude that $\Pi(U)v$ is a weight vector with weight $w \cdot \lambda$.

The same sort of reasoning shows that $\Pi(U)$ is an invertible map of the weight
space with weight $\lambda$ onto the weight space with weight $w \cdot \lambda$, whose inverse is
$\Pi(U)^{-1}$. This means that the two weights have the same multiplicity. □


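To represent the basic weights, (1, 0) and (0, 1), in our new approach, we look
for diagonal, trace-zero matrices $\mu_1$ and $\mu_2$ such that

\begin{align*}
\langle \mu_1, H_1 \rangle &= 1, & \langle \mu_1, H_2 \rangle &= 0, \\
\langle \mu_2, H_1 \rangle &= 0, & \langle \mu_2, H_2 \rangle &= 1.
\end{align*}

These are easily found as

\[ \mu_1 = \operatorname{diag}(2/3, -1/3, -1/3); \quad \mu_2 = \operatorname{diag}(1/3, 1/3, -2/3). \]

The positive simple roots $(2, -1)$ and $(-1, 2)$ are then represented as

\begin{align*}
\alpha_1 &= 2\mu_1 - \mu_2 = \operatorname{diag}(1, -1, 0); \\
\alpha_2 &= -\mu_1 + 2\mu_2 = \operatorname{diag}(0, 1, -1). \tag{6.14}
\end{align*}

Note that both $\alpha_1$ and $\alpha_2$ have length $\sqrt{2}$ and $\langle \alpha_1, \alpha_2 \rangle = -1$. Thus, the angle $\theta$
between them satisfies $\cos\theta = -1/2$, so that $\theta = 2\pi/3$.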
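
These values are easy to confirm with exact arithmetic. The sketch below is illustrative only: it stores a diagonal matrix as a tuple of its diagonal entries and assumes all entries are real, so that the inner product (6.13) needs no complex conjugation.

```python
from fractions import Fraction as F
import math

def inner(x, y):
    """Inner product (6.13) for diagonal matrices with REAL entries,
    each stored as a tuple of its diagonal entries."""
    return sum(a * b for a, b in zip(x, y))

H1, H2 = (1, -1, 0), (0, 1, -1)
mu1 = (F(2, 3), F(-1, 3), F(-1, 3))
mu2 = (F(1, 3), F(1, 3), F(-2, 3))

# Simple roots expressed through the fundamental weights, as in (6.14).
alpha1 = tuple(2 * a - b for a, b in zip(mu1, mu2))
alpha2 = tuple(-a + 2 * b for a, b in zip(mu1, mu2))

# Angle between the simple roots.
cos_theta = inner(alpha1, alpha2) / math.sqrt(
    inner(alpha1, alpha1) * inner(alpha2, alpha2))
theta = math.acos(cos_theta)
```

The same `inner` function also shows that $\mu_1$ and $\mu_2$ have squared length 2/3 and inner product 1/3, so the angle between the fundamental weights is $\pi/3$.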

Figure 6.2 shows the same information as Figure 6.1, namely, the roots and
the dominant integral elements, but now drawn relative to the Weyl-invariant inner
product in (6.13). We draw only the two-dimensional real subspace of $\mathfrak{h}$ consisting
of those elements $\mu$ such that $\langle \mu, H_1 \rangle$ and $\langle \mu, H_2 \rangle$ are real, since all the roots and
weights have this property. Let $w_{(1,2,3)}$ denote the Weyl group element that acts by
cyclically permuting the diagonal entries of each $H \in \mathfrak{h}$. Then $w_{(1,2,3)}$ takes $\alpha_1$ to $\alpha_2$
and $\alpha_2$ to $-(\alpha_1 + \alpha_2)$, which is a counterclockwise rotation by $2\pi/3$ in Figure 6.2.
Similarly, if $w_{(1,2)}$ is the element that interchanges the first two diagonal entries of
$H \in \mathfrak{h}$, then $w_{(1,2)}$ maps $\alpha_1$ to $-\alpha_1$ and $\alpha_2$ to $\alpha_1 + \alpha_2$. Thus, $w_{(1,2)}$ is the reflection
across the line perpendicular to $\alpha_1$. The reader is invited to compute the action of the
remaining elements of the Weyl group and to verify that it is the symmetry group of
the equilateral triangle in Figure 6.3.

Fig. 6.2 The roots and dominant integral elements for sl(3; C), computed relative to a Weyl-invariant inner product

Fig. 6.3 The Weyl group is the symmetry group of the indicated equilateral triangle

We previously defined a pair $(m_1, m_2)$ to be integral if $m_1$ and $m_2$ are integers and
dominant if $m_1 \geq 0$ and $m_2 \geq 0$. These concepts translate into our new language
as follows. If $\lambda \in \mathfrak{h}$, then $\lambda$ is integral if $\langle \lambda, H_1 \rangle$ and $\langle \lambda, H_2 \rangle$ are integers and $\lambda$
is dominant if $\langle \lambda, H_1 \rangle \geq 0$ and $\langle \lambda, H_2 \rangle \geq 0$. Geometrically, the set of dominant
elements is a sector spanning an angle of $\pi/3$.
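
The action of the Weyl group on the roots described above can be reproduced by permuting diagonal entries. This is a small sketch under stated assumptions: elements of $\mathfrak{h}$ are stored as tuples of diagonal entries, and the names `act`, `in_root_basis`, `CYCLE`, and `SWAP` are hypothetical. The direction of the cyclic permutation is chosen so that $\alpha_1$ is sent to $\alpha_2$, matching the text.

```python
from itertools import permutations

ALPHA1 = (1, -1, 0)
ALPHA2 = (0, 1, -1)

def act(perm, h):
    """Permute diagonal entries: the entry in slot j moves to slot perm[j]."""
    out = [0, 0, 0]
    for j, k in enumerate(perm):
        out[k] = h[j]
    return tuple(out)

def in_root_basis(h):
    """Coefficients (p, q) with diag(h) = p*alpha1 + q*alpha2 (trace-zero h)."""
    a, b, c = h
    assert a + b + c == 0
    return (a, -c)

def inner(x, y):
    """Inner product (6.13) restricted to real diagonal matrices."""
    return sum(s * t for s, t in zip(x, y))

def roots():
    """The six roots, as diagonal matrices."""
    a3 = tuple(s + t for s, t in zip(ALPHA1, ALPHA2))
    positive = [ALPHA1, ALPHA2, a3]
    return set(positive) | {tuple(-s for s in r) for r in positive}

CYCLE = (1, 2, 0)  # w_(1,2,3): diag(a, b, c) -> diag(c, a, b)
SWAP = (1, 0, 2)   # w_(1,2): interchanges the first two diagonal entries
```

For instance, `in_root_basis(act(CYCLE, ALPHA2))` returns the coefficients of $-(\alpha_1 + \alpha_2)$, and applying `act` with any of the six permutations maps the set `roots()` to itself while preserving `inner`.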


6.7 Weight Diagrams 


In this section, we display the weights and multiplicities for several irreducible
representations of sl(3; C). Figure 6.4 covers the irreducible representations with
highest weights (1, 1), (1, 2), (0, 4), and (2, 2). The first of these examples was
analyzed in Sect. 6.5, and the other examples can be analyzed by the same method.
In each part of the figure, the arrows indicate the roots, the two black lines indicate
the boundary of the set of dominant elements, and the dashed lines indicate the
boundary of the set of points lower than the highest weight. Each weight of
a particular representation is indicated by a black dot, with a number next to
a dot indicating its multiplicity. A dot without a number indicates a weight of
multiplicity 1.

Fig. 6.4 Weight diagrams for representations with highest weights (1, 1), (1, 2), (0, 4), and (2, 2)

Our last example is the representation with highest weight (9, 2) (Figure 6.5),
which cannot feasibly be analyzed using the method of Sect. 6.5. Instead, the
weights are determined by the results of Sect. 6.8 and the multiplicities are computed
using the Kostant multiplicity formula. (See Figure 10.8 in Sect. 10.6.) See also
Exercises 11 and 12 for another approach to computing multiplicities.


6.8 Further Properties of the Representations 


Although we now have a classification of the irreducible representations of sl(3; C) 
by means of their highest weights, there are other things we might like to know 
about the representations, such as (1) the other weights that occur, besides the 
highest weight, (2) the multiplicities of those weights, and (3) the dimension of the 
representation. In this section, we establish which weights occur and state without 
proof the formula for the dimension. A formula for the multiplicities and a proof of 
the dimension formula are given in Chapter 10 in the setting of general semisimple 
Lie algebras.

Fig. 6.5 Weight diagram for the irreducible representation with highest weight (9, 2)


Definition 6.23. If $v_1, \ldots, v_N$ are elements of a real or complex vector space, the
convex hull of $v_1, \ldots, v_N$ is the set of all vectors of the form

\[ c_1 v_1 + c_2 v_2 + \cdots + c_N v_N \]

where the $c_j$'s are non-negative real numbers satisfying $c_1 + c_2 + \cdots + c_N = 1$.


Equivalently, the convex hull of $v_1, \ldots, v_N$ is the smallest convex set that
contains all of the $v_j$'s.


Theorem 6.24. Let $\mu$ be a dominant integral element and let $V_\mu$ be the irreducible
representation with highest weight $\mu$. If $\lambda$ is a weight of $V_\mu$, then $\lambda$ satisfies the
following two conditions: (1) $\mu - \lambda$ can be expressed as an integer combination of
roots, and (2) $\lambda$ belongs to the convex hull of $W \cdot \mu$, the orbit of $\mu$ under the action
of W.


Proof. According to the proof of Proposition 6.11, $V_\mu$ is spanned by vectors of
the form in (6.8). These vectors are weight vectors with weights of the form
$\lambda := \mu - \alpha_{j_1} - \cdots - \alpha_{j_N}$. Thus, every weight of $V_\mu$ satisfies the first property in the
theorem.

The second property in the theorem is based on the following idea: If $\lambda$ is a
weight of $V_\mu$, then $w \cdot \lambda$ is also a weight for all $w \in W$, which means that $w \cdot \lambda$ is
lower than $\mu$. We can now argue "pictorially" that if $\lambda$ were not in the convex hull
of $W \cdot \mu$, there would be some $w \in W$ for which $w \cdot \lambda$ is not lower than $\mu$, so that $\lambda$
could not be a weight of $V_\mu$. See Figure 6.6.

Fig. 6.6 The integral element $\lambda$ is outside the convex hull of the orbit of $\mu$, and the element $w \cdot \lambda$ is not lower than $\mu$

We can give a more formal argument as follows. For any weight $\lambda$ of $V_\mu$, we
can, by Exercise 10, find some $w \in W$ so that $\lambda' := w \cdot \lambda$ is dominant. Since $\lambda'$
is also a weight of $V_\mu$, we must have $\lambda' \preceq \mu$. Thus, $\lambda'$ is in the quadrilateral $Q_\mu$
consisting of dominant elements that are lower than $\mu$ (Figure 6.7). We now argue
that the vertices of $Q_\mu$ are all in the convex hull $E_\mu$ of $W \cdot \mu$. First, it is easy to see that for any
$\mu$, the average of $w \cdot \mu$ over all $w \in W$ is zero, which means that 0 is in $E_\mu$. Second,
the vertices marked $v_1$ and $v_2$ in the figure are expressible as follows:

\begin{align*}
v_1 &= \tfrac{1}{2}\mu + \tfrac{1}{2} s_{\alpha_1} \cdot \mu, \\
v_2 &= \tfrac{1}{2}\mu + \tfrac{1}{2} s_{\alpha_2} \cdot \mu,
\end{align*}

where $s_{\alpha_1}$ and $s_{\alpha_2}$ are the Weyl group elements given by reflecting about the lines
orthogonal to $\alpha_1$ and $\alpha_2$. Thus, all the vertices of $Q_\mu$ are in $E_\mu$, from which it follows
that $Q_\mu$ itself is contained in $E_\mu$.

Now, $W \cdot \mu$ is clearly W-invariant, which means that $E_\mu$ is also W-invariant.
Since $\lambda' \in Q_\mu \subset E_\mu$, we have $\lambda = w^{-1} \cdot \lambda' \in E_\mu$ as well. □

Fig. 6.7 The shaded quadrilateral is the set of all points that are dominant and lower than $\mu$


Theorem 6.25. Suppose $V_\mu$ is an irreducible representation with highest weight $\mu$
and that $\lambda$ is an integral element satisfying the two conditions in Theorem 6.24.
Then $\lambda$ is a weight of $V_\mu$.


Theorem 6.25 says, in effect, that there are no unexpected holes in the set of
weights of $V_\mu$. The key to the proof is the "no holes" result (Point 4 of Theorem 4.34)
we previously established for sl(2; C).


Lemma 6.26. Let $\gamma$ be a weight of $V_\mu$, let $\alpha$ be a root, and let $s_\alpha \in W$ be the
reflection about the line orthogonal to $\alpha$. Suppose $\lambda$ is a point on the line segment
joining $\gamma$ to $s_\alpha \cdot \gamma$ with the property that $\gamma - \lambda$ is an integer multiple of $\alpha$. Then $\lambda$ is
also a weight of $V_\mu$.


See Figure 6.8 for an example. Note from Figure 6.3 that for each root $\alpha$, the
reflection $s_\alpha$ is an element of the Weyl group.


Proof. Since the reflections associated to $\alpha$ and $-\alpha$ are the same, it suffices to
consider the roots $\alpha_1$, $\alpha_2$, and $\alpha_3 := \alpha_1 + \alpha_2$. If we let $H_3 = H_1 + H_2$, then
for $j = 1, 2, 3$ we have a subalgebra $\mathfrak{s}_j = \langle X_j, Y_j, H_j \rangle$ isomorphic to sl(2; C) such
that $X_j$ is a root vector with root $\alpha_j$ and $Y_j$ is a root vector with root $-\alpha_j$. Since

\[ [H_j, X_j] = 2X_j = \langle \alpha_j, H_j \rangle X_j, \]

we have $\langle \alpha_j, H_j \rangle = 2$ for each j.

Let us now fix a weight $\gamma$ of $V_\mu$ and let U be the span of all the weight vectors in
$V_\mu$ whose weights are of the form $\gamma + k\alpha_j$ for some real number k. (These weights
are circled in Figure 6.8.) Since, by Lemma 6.5, $\pi(X_j)$ and $\pi(Y_j)$ shift weights
by $\pm\alpha_j$, we see that U is invariant under $\mathfrak{s}_j$ and thus constitutes a representation
of $\mathfrak{s}_j$ (not necessarily irreducible). With our new perspective that roots are elements
of $\mathfrak{h}$, we can verify from (6.14) that for each j, we have $\alpha_j = H_j$, from which
it follows that $s_{\alpha_j} \cdot H_j = -H_j$. Thus, if u and v are weight vectors with weights
$\gamma$ and $s_{\alpha_j} \cdot \gamma$, respectively, u and v are in U and are eigenvectors for $\pi(H_j)$ with
eigenvalues $\langle \gamma, H_j \rangle$ and

\[ \langle s_{\alpha_j} \cdot \gamma, H_j \rangle = \langle \gamma, s_{\alpha_j} \cdot H_j \rangle = -\langle \gamma, H_j \rangle, \]

respectively.

Fig. 6.8 Since $\gamma$ is a weight of $V_\mu$, each of the elements $\gamma - \alpha, \gamma - 2\alpha, \ldots, s_\alpha \cdot \gamma$ must also be a weight of $V_\mu$

If $\lambda$ is on the line segment joining $\gamma$ to $s_{\alpha_j} \cdot \gamma$, we see that $\langle \lambda, H_j \rangle$ is between $\langle \gamma, H_j \rangle$
and $\langle s_{\alpha_j} \cdot \gamma, H_j \rangle = -\langle \gamma, H_j \rangle$. If, in addition, $\lambda$ differs from $\gamma$ by an integer multiple
of $\alpha_j$, then $\langle \gamma, H_j \rangle$ differs from $\langle \lambda, H_j \rangle$ by an integer multiple of $\langle \alpha_j, H_j \rangle = 2$.
Thus, by applying Point 4 of Theorem 4.34 to the action of $\mathfrak{s}_j$ on U, there must be
an eigenvector w for $\pi(H_j)$ in U with eigenvalue $l = \langle \lambda, H_j \rangle$. Since the unique
weight of the form $\gamma + k\alpha_j$ for which $\langle \gamma + k\alpha_j, H_j \rangle = \langle \lambda, H_j \rangle$ is the one where
$\gamma + k\alpha_j = \lambda$, we conclude that $\lambda$ is a weight of $V_\mu$. □
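
To see the lemma at work on a concrete weight, one can list the points it produces. The sketch below is only an illustration, with hypothetical helper names; it records a weight $\gamma$ by its coordinates $(m_1, m_2) = (\langle \gamma, H_1 \rangle, \langle \gamma, H_2 \rangle)$ and uses the reflection formula $s_{\alpha_j} \cdot \gamma = \gamma - \langle \gamma, H_j \rangle \alpha_j$, which follows from $\alpha_j = H_j$ and $\langle \alpha_j, \alpha_j \rangle = 2$.

```python
# alpha_1, alpha_2, alpha_3 in (m1, m2) coordinates
ROOTS = {1: (2, -1), 2: (-1, 2), 3: (1, 1)}

def pairing(gamma, j):
    """<gamma, H_j>, where H_3 = H_1 + H_2."""
    m1, m2 = gamma
    return {1: m1, 2: m2, 3: m1 + m2}[j]

def reflect(gamma, j):
    """s_{alpha_j} . gamma = gamma - <gamma, H_j> alpha_j."""
    p = pairing(gamma, j)
    a = ROOTS[j]
    return (gamma[0] - p * a[0], gamma[1] - p * a[1])

def string(gamma, j):
    """Points gamma - k*alpha_j (k an integer) on the segment from gamma
    to s_{alpha_j} . gamma, listed starting at gamma."""
    p = pairing(gamma, j)
    a = ROOTS[j]
    step = 1 if p >= 0 else -1
    return [(gamma[0] - k * a[0], gamma[1] - k * a[1])
            for k in range(0, p + step, step)]
```

For example, with $\gamma = (1, 1)$ and $j = 3$ the call `string((1, 1), 3)` lists $(1, 1)$, $(0, 0)$, and $(-1, -1)$, which are the highest root, zero, and the lowest root of the adjoint representation.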


Proof of Theorem 6.25. Suppose that $\lambda$ satisfies the two conditions in the theorem,
and write $\lambda = \mu - n_1\alpha_1 - n_2\alpha_2$. Consider first the case $n_1 \geq n_2$, so that

\begin{align*}
\lambda &= \mu - (n_1 - n_2)\alpha_1 - n_2(\alpha_1 + \alpha_2) \\
&= \mu - (n_1 - n_2)\alpha_1 - n_2\alpha_3,
\end{align*}

where $\alpha_3 = \alpha_1 + \alpha_2$. If we start at $\lambda$ and travel in the direction of $\alpha_3$, we will hit the
boundary of $E_\mu$ at the point

\[ \gamma := \mu - (n_1 - n_2)\alpha_1. \]

(See Figure 6.9.) Thus, $\gamma$ is in $E_\mu$ and must therefore be between $\mu$ and $s_{\alpha_1} \cdot \mu$. Since
also $\gamma$ differs from $\mu$ by an integer multiple of $\alpha_1$ (namely $n_1 - n_2$), Lemma 6.26
says that $\gamma$ is a weight of $V_\mu$. Meanwhile, $\lambda$ is between $\gamma$ and $s_{\alpha_3} \cdot \gamma$ (see, again,
Figure 6.9) and differs from $\gamma$ by an integer multiple of $\alpha_3$ (namely $n_2$). Thus, the
lemma tells us that $\lambda$ must be a weight of $V_\mu$, as claimed. If $n_1 \leq n_2$, we can use a
similar argument with the roles of $\alpha_1$ and $\alpha_2$ reversed. □

Fig. 6.9 By applying Lemma 6.26 twice, we can see that $\gamma$ and $\lambda$ must be weights of $V_\mu$
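
Taken together, Theorems 6.24 and 6.25 give a practical test for whether an integral element is a weight of $V_\mu$. The following sketch is one possible implementation, not a procedure from the text. It works in the coordinates $(m_1, m_2)$, in which $\alpha_1 = (2, -1)$ and $\alpha_2 = (-1, 2)$, and it assumes the equivalence suggested by the proof of Theorem 6.24: a dominant element lies in the convex hull of $W \cdot \mu$ exactly when it is lower than $\mu$. All function names are hypothetical.

```python
from fractions import Fraction

def s1(lam):
    """Reflection s_{alpha_1}: lam - <lam, H_1> alpha_1 in (m1, m2) coordinates."""
    m1, m2 = lam
    return (-m1, m1 + m2)

def s2(lam):
    """Reflection s_{alpha_2}: lam - <lam, H_2> alpha_2 in (m1, m2) coordinates."""
    m1, m2 = lam
    return (m1 + m2, -m2)

def weyl_orbit(lam):
    """Orbit of lam under the group generated by s1 and s2."""
    orbit, frontier = {lam}, [lam]
    while frontier:
        x = frontier.pop()
        for s in (s1, s2):
            y = s(x)
            if y not in orbit:
                orbit.add(y)
                frontier.append(y)
    return orbit

def root_coefficients(mu, lam):
    """Solve mu - lam = a*alpha_1 + b*alpha_2 for (a, b)."""
    d1, d2 = mu[0] - lam[0], mu[1] - lam[1]
    return (Fraction(2 * d1 + d2, 3), Fraction(d1 + 2 * d2, 3))

def is_weight(mu, lam):
    """Test the two conditions of Theorem 6.24 on a dominant representative."""
    dominant = next(x for x in weyl_orbit(lam) if x[0] >= 0 and x[1] >= 0)
    a, b = root_coefficients(mu, dominant)
    return a >= 0 and b >= 0 and a.denominator == 1 and b.denominator == 1

def weights(mu):
    """All weights of V_mu, found among mu - a*alpha_1 - b*alpha_2."""
    n = mu[0] + mu[1]  # a and b never exceed m1 + m2
    found = set()
    for a in range(n + 1):
        for b in range(n + 1):
            lam = (mu[0] - 2 * a + b, mu[1] + a - 2 * b)
            if is_weight(mu, lam):
                found.add(lam)
    return found
```

With `mu = (1, 1)` this returns seven weights (the six roots and zero), in agreement with the first diagram in Figure 6.4. The sketch determines only which weights occur, not their multiplicities.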


We close this section by stating the formula for the dimension of an irreducible 
representation of sl(3;C). We will prove the result in Chapter 10 as a special case 
of the Weyl dimension formula. 


Theorem 6.27. The dimension of the irreducible representation with highest
weight $(m_1, m_2)$ is

\[ \frac{1}{2}(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2). \]
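
As a quick illustration, the formula can be evaluated directly; the function name `dimension` below is only a label chosen for this sketch.

```python
def dimension(m1, m2):
    """Dimension of the irreducible representation of sl(3; C) with
    highest weight (m1, m2), according to Theorem 6.27."""
    # The product is always even: if m1 + 1 and m2 + 1 are both odd,
    # then m1 + m2 + 2 is even. Integer division is therefore exact.
    return (m1 + 1) * (m2 + 1) * (m1 + m2 + 2) // 2
```

For the highest weights (1, 1), (1, 2), (0, 4), and (2, 2) this gives 8, 15, 15, and 27, respectively, and for (9, 2) it gives 195.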


The reader is invited to verify this formula by direct computation in the
representations depicted in Figure 6.4.


6.9 Exercises


1. Show that the roots listed in (6.3) are the only roots.

2. Let $\pi$ be an irreducible finite-dimensional representation of sl(3; C) acting on a
space V and let $\pi^*$ be the dual representation to $\pi$, acting on $V^*$, as defined in
Sect. 4.3.3. Show that the weights of $\pi^*$ are the negatives of the weights of $\pi$.
Hint: Choose a basis for V in which both $\pi(H_1)$ and $\pi(H_2)$ are diagonal.

3. Let $\pi$ be an irreducible representation of sl(3; C) with highest weight $\mu$.

(a) Let $\alpha_3 = \alpha_1 + \alpha_2$ and let $s_{\alpha_3}$ denote the reflection about the line orthogonal
to $\alpha_3$. Show that the lowest weight for $\pi$ is $s_{\alpha_3} \cdot \mu$.

(b) Show that the highest weight for the dual representation $\pi^*$ to $\pi$ is the
weight

\[ \mu' := -s_{\alpha_3} \cdot \mu. \]

(c) Let $\mu_1$ and $\mu_2$ be the fundamental weights, as in Figure 6.2. If
$\mu = m_1\mu_1 + m_2\mu_2$, show that $\mu' = m_2\mu_1 + m_1\mu_2$. That is to say, the dual to the
representation with highest weight $(m_1, m_2)$ has highest weight $(m_2, m_1)$.

4. Consider the adjoint representation of sl(3; C) as a representation of sl(2; C)
by restricting the adjoint representation to the subalgebra spanned by $X_1$, $Y_1$,
and $H_1$. Decompose this representation as a direct sum of irreducible
representations of sl(2; C). Which representations occur and with what multiplicity?

5. Suppose that V and W are finite-dimensional inner product spaces over $\mathbb{C}$.
Show that there exists a unique inner product on $V \otimes W$ such that

\[ \langle v \otimes w, v' \otimes w' \rangle = \langle v, v' \rangle \langle w, w' \rangle \]

for all $v, v' \in V$ and $w, w' \in W$.
Hint: Let $\{e_j\}$ and $\{f_k\}$ be orthonormal bases for V and W, respectively. Take
the inner product on $V \otimes W$ for which $\{e_j \otimes f_k\}$ is an orthonormal basis.

6. Following the method of Sect. 6.5, work out the representation of sl(3; C) with
highest weight (2, 0), acting on a subspace of $\mathbb{C}^3 \otimes \mathbb{C}^3$. Determine all the
weights of this representation and their multiplicity (i.e., the dimension of the
corresponding weight space). Verify that the dimension formula (Theorem 6.27)
holds in this case.

7. Consider the nine-dimensional representation of sl(3; C) considered in
Sect. 6.5, namely the tensor product of the representations with highest
weights (1, 0) and (0, 1). Decompose this representation as a direct sum of
irreducibles. Do the same for the tensor product of two copies of the irreducible
representation with highest weight (1, 0). (Compare Exercise 6.)

8. Let $W_m$ denote the space of homogeneous polynomials on $\mathbb{C}^3$ of degree m. Let
SU(3) act on $W_m$ by the obvious generalization of the action in Example 4.10.

(a) Show that the associated representation of sl(3; C) contains a highest
weight cyclic representation with highest weight (0, m) and highest weight
vector $z_3^m$.

(b) By imitating the proof of Proposition 4.11, show that any nonzero invariant
subspace of $W_m$ must contain $z_3^m$.

(c) Conclude that $W_m$ is irreducible with highest weight (0, m).

9. Show that Z and N (defined in Definition 6.18) are subgroups of SU(3). Show
that Z is a normal subgroup of N.

10. Suppose $\lambda$ is an integral element, that is, one of the triangular lattice points in
Figure 6.2. Show that there is an element w of the Weyl group such that $w \cdot \lambda$ is
dominant integral, that is, one of the black dots in Figure 6.2.
Hint: Recall that the Weyl group is the symmetry group of the triangle in
Figure 6.3.

11. (a) Regard the Weyl group as a group of linear transformations of $\mathfrak{h}$. Show that
$-I$ is not an element of the Weyl group.

(b) Which irreducible representations of sl(3; C) have the property that their
weights are invariant under $-I$?

12. Suppose $(\pi, V)$ is an irreducible representation of sl(3; C) with highest weight
$\mu$ and highest weight vector $v_0$. Show that the weight space with weight
$\mu - \alpha_1 - \alpha_2$ has multiplicity at most 2 and is spanned by the vectors

\[ \pi(Y_1)\pi(Y_2)v_0, \quad \pi(Y_2)\pi(Y_1)v_0. \]

13. Let $(\pi, V)$ be the irreducible representation with highest weight $(m_1, m_2)$. As in
the proof of Proposition 6.17, choose an inner product on V such that
$\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{su}(3) \subset$ sl(3; C). Let $v_0$ be a highest weight vector for V,
normalized to be a unit vector, and define vectors $u_1$ and $u_2$ in V as

\[ u_1 = \pi(Y_1)\pi(Y_2)v_0; \quad u_2 = \pi(Y_2)\pi(Y_1)v_0. \]

Each of these vectors is either zero or a weight vector with weight $\mu - \alpha_1 - \alpha_2$.

(a) Using the commutation relations among the basis elements of sl(3; C),
show that

\begin{align*}
\langle u_1, u_1 \rangle &= m_2(m_1 + 1) \\
\langle u_2, u_2 \rangle &= m_1(m_2 + 1) \\
\langle u_1, u_2 \rangle &= m_1 m_2.
\end{align*}

Hint: Show that $\pi(X_j)^* = \pi(Y_j)$ for $j = 1, 2, 3$.

(b) Show that if $m_1 \geq 1$ and $m_2 \geq 1$ then $u_1$ and $u_2$ are linearly independent.
Conclude that the weight $\mu - \alpha_1 - \alpha_2$ has multiplicity 2.

(c) Show that if $m_1 = 0$ and $m_2 \geq 1$ or $m_1 \geq 1$ and $m_2 = 0$, then the weight
$\mu - \alpha_1 - \alpha_2$ has multiplicity 1.

Note: The reader may verify the results of this exercise in the representations
depicted in Figure 6.4.


Chapter 7 
Semisimple Lie Algebras 


In this chapter we introduce a class of Lie algebras, the semisimple algebras, for 
which we can classify the irreducible representations using a strategy similar to 
the one we used for sl(3;C). In this chapter, we develop the relevant structures 
of semisimple Lie algebras. In Chapter 8, we look into the properties of the set 
of roots. Then in Chapter 9, we construct and classify the irreducible, finite- 
dimensional representations of semisimple Lie algebras. Finally, in Chapter 10, 
we consider several additional properties of the representations constructed in 
Chapter 9. Meanwhile, in Chapters 11 and 12, we consider representation theory 
from the closely related viewpoint of compact Lie groups. 


7.1 Semisimple and Reductive Lie Algebras 


We begin by defining the term semisimple. There are many equivalent characteriza- 
tions of semisimple Lie algebras. It is not, however, always easy to prove that two 
of these various characterizations are equivalent. We will use an atypical definition, 
which allows for a rapid development of the structure of semisimple Lie algebras. 
Recall from Sect. 3.6 the notion of the complexification of a real Lie algebra. 


Definition 7.1. A complex Lie algebra $\mathfrak{g}$ is reductive if there exists a compact
matrix Lie group K, with Lie algebra $\mathfrak{k}$, such that

\[ \mathfrak{g} \cong \mathfrak{k}_{\mathbb{C}}. \]

A complex Lie algebra $\mathfrak{g}$ is semisimple if it is reductive and the center of $\mathfrak{g}$ is trivial.


Definition 7.2. If $\mathfrak{g}$ is a semisimple Lie algebra, a real subalgebra $\mathfrak{k}$ of $\mathfrak{g}$ is a
compact real form of $\mathfrak{g}$ if $\mathfrak{k}$ is isomorphic to the Lie algebra of some compact matrix
Lie group and every element Z of $\mathfrak{g}$ can be expressed uniquely as $Z = X + iY$,
with $X, Y \in \mathfrak{k}$.

On the one hand, using Definition 7.1 gives an easy method of constructing
Cartan subalgebras and fits naturally with our study of compact Lie groups in 
Part III. On the other hand, this definition covers an apparently smaller class of 
Lie algebras than some of the more standard definitions. That is to say, we will 
prove (Theorem 7.8 and Exercise 6) that the condition in Definition 7.1 implies 
two of the standard definitions of “semisimple,” but we will not prove the reverse 
implications. These reverse implications are, in fact, true, so that our definition of 
semisimplicity is ultimately equivalent to any other definition. But it is not possible 
to prove the reverse implications without giving up the gains in efficiency that go 
with Definition 7.1. The reader who wishes to see a development of the theory 
starting from a more traditional definition of semisimplicity may consult Chapter 
II (along with the first several sections of Chapter I) of [Kna2]. 

The only time we use the compact group in Definition 7.1 is to construct the inner 
product in Proposition 7.4. In the standard treatment of semisimple Lie algebras, the 
Killing form (Exercise 6) is used in place of this inner product. Our use of an inner 
product in place of the bilinear Killing form substantially simplifies some of the 
arguments. Notably, in our construction of Cartan subalgebras (Proposition 7.11), 
we use that a skew self-adjoint operator is always diagonalizable. By contrast, 
an operator that is skew symmetric with respect to a nondegenerate bilinear form 
need not be diagonalizable. Thus, the construction of Cartan subalgebras in the 
conventional approach is substantially more involved than in our approach. 

For a complex semisimple Lie algebra $\mathfrak{g}$, we will always assume we have chosen
a compact real form $\mathfrak{k}$ of $\mathfrak{g}$, so that $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$.


Example 7.3. The following Lie algebras are semisimple:

sl(n; C), $n \geq 2$
so(n; C), $n \geq 3$
sp(n; C), $n \geq 1$.

The Lie algebras gl(n; C) and so(2; C) are reductive but not semisimple.


Proof. It is easy to see that the listed Lie algebras are reductive, with the
corresponding compact groups K being SU(n), SO(n), Sp(n), U(n), and SO(2), respectively.
(Compare (3.17) in Sect. 3.6.) The Lie algebra gl(n; C) has a nontrivial center,
consisting of scalar multiples of the identity, while the Lie algebra so(2; C) is
commutative. It remains only to show that the centers of sl(n; C), so(n; C), and
sp(n; C) are trivial for the indicated values of n.

Consider first the case of sl(n; C) and let X be an element of the center of
sl(n; C). For any $1 \leq j, k \leq n$, let $E_{jk}$ be the matrix with a 1 in the (j, k) spot
and zeros elsewhere. Consider the matrix $H_{jk} \in$ sl(n; C) given by

\[ H_{jk} = E_{jj} - E_{kk}, \]

for $j < k$. Since X is central, $[H_{jk}, X] = 0$. We may easily calculate that the
(j, k) and (k, j) entries of this commutator are

\[ [H_{jk}, X]_{jk} = 2X_{jk}, \qquad [H_{jk}, X]_{kj} = -2X_{kj}, \]

so we conclude that $X_{jk} = X_{kj} = 0$. Since this holds for all $j < k$, we see that X must be diagonal.
Once X is known to be diagonal, we may compute that for $j \neq k$,

\[ 0 = [X, E_{jk}] = (X_{jj} - X_{kk}) E_{jk}. \]

Thus, all the diagonal entries of X must be equal. But since, also, trace(X) = 0, we
conclude that X must be zero.
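
The two commutator computations used in this argument can be spot-checked on a sample matrix. The sketch below is only a numerical sanity check, not a proof; it uses 0-based indices, and the helper names are hypothetical.

```python
def unit(n, j, k):
    """Matrix unit E_{jk} of size n x n (0-based indices)."""
    return [[1 if (r, c) == (j, k) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def sub(a, b):
    n = len(a)
    return [[a[r][c] - b[r][c] for c in range(n)] for r in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    return sub(matmul(a, b), matmul(b, a))

def h(n, j, k):
    """H_{jk} = E_{jj} - E_{kk}."""
    return sub(unit(n, j, j), unit(n, k, k))

def diag(entries):
    n = len(entries)
    return [[entries[r] if r == c else 0 for c in range(n)] for r in range(n)]
```

For any square matrix `x`, the (j, k) entry of `commutator(h(n, j, k), x)` is twice the (j, k) entry of `x`, and the (k, j) entry is minus twice the (k, j) entry of `x`. For a diagonal matrix, `commutator(diag(d), unit(n, j, k))` is `d[j] - d[k]` times `unit(n, j, k)`.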

For the remaining semisimple Lie algebras in Example 7.3, the calculations in
Sect. 7.7 will allow us to carry out a similar argument. It is proved there that the
center of so(2n; C) is trivial for $n \geq 2$, and a similar analysis shows that the centers
of so(2n + 1; C) and sp(n; C) are also trivial. □


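Proposition 7.4. Let $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ be a reductive Lie algebra. Then there exists an inner
product on $\mathfrak{g}$ that is real valued on $\mathfrak{k}$ and such that the adjoint action of $\mathfrak{k}$ on $\mathfrak{g}$ is
"unitary," meaning that

\[ \langle \mathrm{ad}_X(Y), Z \rangle = -\langle Y, \mathrm{ad}_X(Z) \rangle \tag{7.1} \]

for all $X \in \mathfrak{k}$ and all $Y, Z \in \mathfrak{g}$. If we define an operation $X \mapsto X^*$ on $\mathfrak{g}$ by the
formula

\[ (X_1 + iX_2)^* = -X_1 + iX_2 \tag{7.2} \]

for $X_1, X_2 \in \mathfrak{k}$, then any inner product satisfying (7.1) also satisfies

\[ \langle \mathrm{ad}_X(Y), Z \rangle = \langle Y, \mathrm{ad}_{X^*}(Z) \rangle \tag{7.3} \]

for all X, Y, and Z in $\mathfrak{g}$.


The motivation for the definition of $X^*$ is that if $\mathfrak{g}$ = gl(n; C) and $\mathfrak{k}$ = u(n), then
$X^*$ is the usual matrix adjoint of X.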
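
This motivating remark can be checked on a sample matrix. The following is a minimal sketch, assuming matrices are stored as lists of rows of Python complex numbers; the names `dagger`, `split`, and `star` are hypothetical. It writes X as $X_1 + iX_2$ with $X_1$ and $X_2$ skew-Hermitian (that is, in u(n)) and then applies (7.2).

```python
def dagger(x):
    """Conjugate transpose (the usual matrix adjoint)."""
    n = len(x)
    return [[x[c][r].conjugate() for c in range(n)] for r in range(n)]

def split(x):
    """Write X = X1 + i*X2 with X1 and X2 skew-Hermitian.

    X1 = (X - X^dagger)/2 and X2 = -(i/2)(X + X^dagger).
    """
    xd = dagger(x)
    n = len(x)
    x1 = [[0.5 * (x[r][c] - xd[r][c]) for c in range(n)] for r in range(n)]
    x2 = [[-0.5j * (x[r][c] + xd[r][c]) for c in range(n)] for r in range(n)]
    return x1, x2

def star(x):
    """The operation (7.2): (X1 + i*X2)* = -X1 + i*X2."""
    x1, x2 = split(x)
    n = len(x)
    return [[-x1[r][c] + 1j * x2[r][c] for c in range(n)] for r in range(n)]

def is_skew_hermitian(x):
    n = len(x)
    xd = dagger(x)
    return all(xd[r][c] == -x[r][c] for r in range(n) for c in range(n))
```

For a matrix with small integer real and imaginary parts, all of the arithmetic here is exact in floating point, and `star(x)` agrees entry by entry with `dagger(x)`.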


Proof. By the proof of Theorem 4.28, there is a (real-valued) inner product on $\mathfrak{k}$
that is invariant under the adjoint action of K. This inner product then extends to a
complex inner product on $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ for which the adjoint action of K is unitary. Thus,
by Proposition 4.8, the adjoint action of $\mathfrak{k}$ on $\mathfrak{g}$ is unitary in the sense of (7.1). It is
then a simple matter to check the relation (7.3). □


Recall from Definition 3.9 what it means for a Lie algebra to decompose as a Lie
algebra direct sum of subalgebras.

Proposition 7.5. Suppose $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ is a reductive Lie algebra. Choose an inner
product on $\mathfrak{g}$ as in Proposition 7.4. Then if $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, the orthogonal
complement of $\mathfrak{h}$ is also an ideal, and $\mathfrak{g}$ decomposes as the Lie algebra direct sum
of $\mathfrak{h}$ and $\mathfrak{h}^\perp$.


Proof. An ideal in $\mathfrak{g}$ is nothing but an invariant subspace for the adjoint action of $\mathfrak{g}$
on itself. If $\mathfrak{h}$ is a complex subspace of $\mathfrak{g}$ and is invariant under the adjoint action
of $\mathfrak{k}$, it will also be invariant under the adjoint action of $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$. Thus, if $\mathfrak{h}$ is an ideal
in $\mathfrak{g}$, then by Proposition 4.27, $\mathfrak{h}^\perp$ is invariant under the adjoint action of $\mathfrak{k}$ and is,
therefore, also an ideal.

Now, $\mathfrak{g}$ decomposes as a vector space as $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{h}^\perp$. But since both $\mathfrak{h}$ and $\mathfrak{h}^\perp$
are ideals, for all $X \in \mathfrak{h}$ and $Y \in \mathfrak{h}^\perp$, we have

\[ [X, Y] \in \mathfrak{h} \cap \mathfrak{h}^\perp = \{0\}. \]

Thus, $\mathfrak{g}$ is actually the Lie algebra direct sum of $\mathfrak{h}$ and $\mathfrak{h}^\perp$. □


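Proposition 7.6. Every reductive Lie algebra $\mathfrak{g}$ over $\mathbb{C}$ decomposes as a Lie
algebra direct sum $\mathfrak{g} = \mathfrak{g}_1 \oplus \mathfrak{z}$, where $\mathfrak{z}$ is the center of $\mathfrak{g}$ and where $\mathfrak{g}_1$ is semisimple.


Proof. Since $\mathfrak{z}$ is an ideal in $\mathfrak{g}$, Proposition 7.5 shows that $\mathfrak{g}_1 := \mathfrak{z}^\perp$ is also an ideal
in $\mathfrak{g}$ and that $\mathfrak{g}$ decomposes as a Lie algebra direct sum $\mathfrak{g} = \mathfrak{g}_1 \oplus \mathfrak{z}$. It remains only to
show that $\mathfrak{g}_1$ is semisimple. It is apparent that $\mathfrak{g}_1$ has trivial center, since any central
element Z of $\mathfrak{g}_1$ would also be central in $\mathfrak{g} = \mathfrak{g}_1 \oplus \mathfrak{z}$, in which case, Z must be in
$\mathfrak{z} \cap \mathfrak{g}_1 = \{0\}$. Thus, we must only construct a compact real form of $\mathfrak{g}_1$.

It is easy to see that $Z \in \mathfrak{g}$ belongs to $\mathfrak{z}$ if and only if Z commutes with every
element of $\mathfrak{k}$. From this, it is easily seen that $\mathfrak{z}$ is invariant under "conjugation,"
that is, under the map $X + iY \mapsto X - iY$, where $X, Y \in \mathfrak{k}$. Since $\mathfrak{z}$ is invariant
under conjugation, its orthogonal complement $\mathfrak{g}_1$ is also closed under conjugation. It
follows that $\mathfrak{z}$ is the complexification of $\mathfrak{z}' := \mathfrak{z} \cap \mathfrak{k}$ and that $\mathfrak{g}_1$ is the complexification
of $\mathfrak{k}_1 := \mathfrak{g}_1 \cap \mathfrak{k}$.

We will now show that $\mathfrak{k}_1$ is a compact real form of $\mathfrak{g}_1$. Let $K'$ be the adjoint group
of K, that is, the image in $\mathrm{GL}(\mathfrak{k})$ of the adjoint map. Since K is compact and the
adjoint map is continuous, $K'$ is compact and thus closed. Now, by Proposition 3.34,
the Lie algebra map associated to the map $A \mapsto \mathrm{Ad}_A$ is the map $X \mapsto \mathrm{ad}_X$, the
kernel of which is the center $\mathfrak{z}'$ of $\mathfrak{k}$. Thus, the image of ad is isomorphic to $\mathfrak{k}/\mathfrak{z}' \cong \mathfrak{k}_1$,
showing that $\mathfrak{k}_1$ is isomorphic to the Lie algebra of the image of Ad, namely $K'$. □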


Proposition 7.7. If K is a simply connected compact matrix Lie group with Lie
algebra $\mathfrak{k}$, then $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is semisimple.


Proof. As in the proof of Proposition 7.6, $\mathfrak{k}$ decomposes as a Lie algebra direct sum
$\mathfrak{k} = \mathfrak{k}_1 \oplus \mathfrak{z}'$, where $\mathfrak{z}'$ is the center of $\mathfrak{k}$ and where $\mathfrak{g}_1 := (\mathfrak{k}_1)_{\mathbb{C}}$ is semisimple. Then
by Theorem 5.11, K decomposes as a direct product of closed, simply connected
subgroups $K_1$ and $Z'$. Now, since $\mathfrak{z}'$ is commutative, it is isomorphic to the Lie
algebra of the commutative Lie group $\mathbb{R}^n$, for some n. Since both $Z'$ and $\mathbb{R}^n$ are
simply connected, Corollary 5.7 tells us that $Z' \cong \mathbb{R}^n$. On the other hand, since $Z'$
is a closed subgroup of K, we see that $Z' \cong \mathbb{R}^n$ is compact, which is possible only
if n = 0. Thus, $\mathfrak{z}'$ and $\mathfrak{z} = \mathfrak{z}' + i\mathfrak{z}'$ are zero dimensional, meaning that $\mathfrak{g} = \mathfrak{g}_1$ is
semisimple. □


Recall (Definition 3.11) that a Lie algebra g is said to be simple if g has no 
nontrivial ideals and dim g is at least 2. That is to say, a one-dimensional Lie algebra 
is, by decree, not simple, even though it clearly has no nontrivial ideals. 


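Theorem 7.8. Suppose that $\mathfrak{g}$ is semisimple in the sense of Definition 7.1. Then $\mathfrak{g}$
decomposes as a Lie algebra direct sum

\[ \mathfrak{g} = \bigoplus_{j=1}^{m} \mathfrak{g}_j, \tag{7.4} \]

where each $\mathfrak{g}_j \subset \mathfrak{g}$ is a simple Lie algebra.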


We will see in Sect. 7.6 that most of our examples of semisimple Lie algebras are 
actually simple, meaning that there is only one term in the decomposition in (7.4). 
The converse of Theorem 7.8 is also true; if a complex Lie algebra g decomposes as 
a direct sum of simple algebras, then g is semisimple in the sense of Definition 7.1. 
(See Theorem 6.11 in [Kna2].) 


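Proof. If $\mathfrak{g}$ has a nontrivial ideal $\mathfrak{h}$, then by Proposition 7.5, $\mathfrak{g}$ decomposes as the
Lie algebra direct sum $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{h}^\perp$, where $\mathfrak{h}^\perp$ is also an ideal in $\mathfrak{g}$. Suppose that, say,
$\mathfrak{h}$ has a nontrivial ideal $\mathfrak{h}'$. Since every element of $\mathfrak{h}$ commutes with every element
of $\mathfrak{h}^\perp$, we see that $\mathfrak{h}'$ is actually an ideal in $\mathfrak{g}$. Thus, $\mathfrak{h}'' := (\mathfrak{h}')^\perp \cap \mathfrak{h}$ is an ideal in $\mathfrak{g}$
and we can decompose $\mathfrak{g}$ as $\mathfrak{h}' \oplus \mathfrak{h}'' \oplus \mathfrak{h}^\perp$. Proceeding on in the same way, we can
decompose $\mathfrak{g}$ as a direct sum of ideals $\mathfrak{g}_j$, where each $\mathfrak{g}_j$ has no nontrivial ideals.
It remains only to show that each $\mathfrak{g}_j$ has dimension at least 2. Suppose, toward a
contradiction, that $\dim \mathfrak{g}_j = 1$ for some j. Then $\mathfrak{g}_j$ is necessarily commutative,
which means (since elements of $\mathfrak{g}_j$ commute with elements of $\mathfrak{g}_k$ for $j \neq k$) that
$\mathfrak{g}_j$ is contained in the center $Z(\mathfrak{g})$ of $\mathfrak{g}$, contradicting the assumption that the center
of $\mathfrak{g}$ is trivial. □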


Proposition 7.9. If $\mathfrak{g}$ is a complex semisimple Lie algebra, then the subalgebras $\mathfrak{g}_j$
appearing in the decomposition (7.4) are unique up to order.


To be more precise, suppose $\mathfrak{g}$ is isomorphic to a Lie algebra direct sum of simple
subalgebras $\mathfrak{g}_1, \ldots, \mathfrak{g}_l$ and also to a Lie algebra direct sum of simple subalgebras
$\mathfrak{h}_1, \ldots, \mathfrak{h}_m$. Then the proposition asserts that each $\mathfrak{h}_j$ is actually equal to (not just
isomorphic to) $\mathfrak{g}_k$, for some k. The proposition depends crucially on the fact that the
summands $\mathfrak{g}_j$ in (7.4) are simple (hence $\dim \mathfrak{g}_j \geq 2$) and not just that $\mathfrak{g}_j$ has no
nontrivial ideals. Indeed, if $\mathfrak{g}$ is a two-dimensional commutative Lie algebra, then
$\mathfrak{g}$ can be decomposed as a direct sum of one-dimensional commutative algebras in
many different ways.

Proof. If $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, then $\mathfrak{h}$ can be viewed as a representation of $\mathfrak{g}$ by
means of the map $X \mapsto (\mathrm{ad}_X)|_{\mathfrak{h}}$. If we decompose $\mathfrak{g}$ as the direct sum of simple
subalgebras $\mathfrak{g}_1, \ldots, \mathfrak{g}_m$, then each $\mathfrak{g}_j$ is irreducible as a representation of $\mathfrak{g}$, since any
invariant subspace would be an ideal in $\mathfrak{g}$ and thus an ideal in $\mathfrak{g}_j$. Furthermore, these
representations are nonisomorphic, since the action of $\mathfrak{g}_j$ on $\mathfrak{g}_j$ is nontrivial (since $\mathfrak{g}_j$
is not commutative), whereas the action of each $\mathfrak{g}_k$, $k \neq j$, on $\mathfrak{g}_j$ is trivial. Suppose
now that $\mathfrak{h}$ is an ideal in $\mathfrak{g}$ that is simple as a Lie algebra and, thus, irreducible
under the adjoint action of $\mathfrak{g}$. Now, for any j, the projection map $\pi_j : \mathfrak{g} \to \mathfrak{g}_j$ is
an intertwining map of representations of $\mathfrak{g}$. Thus, for each j, the restriction of $\pi_j$
to $\mathfrak{h}$ must be either 0 or an isomorphism. There must be some j for which $\pi_j|_{\mathfrak{h}}$ is
nonzero and hence an isomorphism. But since the various $\mathfrak{g}_j$'s are nonisomorphic
representations of $\mathfrak{g}$, we must have $\pi_k|_{\mathfrak{h}} = 0$ for all $k \neq j$. Thus, actually,

\[ \mathfrak{h} = \mathfrak{g}_j. \]

□


Before getting into the details of semisimple Lie algebras, let us briefly outline 
what our strategy will be in classifying their representations and what structures 
we will need to carry out this strategy. We will look for commuting elements 
H_1, ..., H_r in our Lie algebra that we will try to simultaneously diagonalize 
in each representation. We should find as many such elements as possible, and 
if they are going to be simultaneously diagonalizable in every representation, 
they must certainly be diagonalizable in the adjoint representation. This leads 
(in basis-independent language) to the definition of a Cartan subalgebra. The 
nonzero sets of simultaneous eigenvalues for ad_{H_1}, ..., ad_{H_r} are called roots and 
the corresponding simultaneous eigenvectors are called root vectors. The root 
vectors will serve to raise and lower the eigenvalues of π(H_1), ..., π(H_r) in 
each representation π. We will also have the Weyl group, which is an important 
symmetry of the roots and also of the weights in each representation. 

One crucial part of the structure of semisimple Lie algebras is the existence of 
certain special subalgebras isomorphic to sl(2; C). Several times over the course of 
this chapter and the subsequent ones, we will make use of our knowledge of the 
representations of sl(2; C), applied to these subalgebras. If X, Y, and H are the 
usual basis elements for sl(2; C), then of particular importance is the fact that in any 
finite-dimensional representation π of sl(2; C) (not necessarily irreducible), every 
eigenvalue of π(H) must be an integer. (Compare Point 1 of Theorem 4.34 and 
Exercise 13 in Chapter 4.) 


7.2 Cartan Subalgebras 


Our first task is to identify certain special sorts of commutative subalgebras, called 
Cartan subalgebras. 


Definition 7.10. If g is a complex semisimple Lie algebra, then a Cartan subalgebra 
of g is a complex subspace h of g with the following properties: 

1. For all H_1 and H_2 in h, [H_1, H_2] = 0. 
2. If, for some X ∈ g, we have [H, X] = 0 for all H in h, then X is in h. 
3. For all H in h, ad_H is diagonalizable. 


Condition 1 says that h is a commutative subalgebra of g. Condition 2 says that h 
is a maximal commutative subalgebra (i.e., not contained in any larger commutative 
subalgebra). Condition 3 says that each ad_H (H ∈ h) is diagonalizable. Since 
the H's in h commute, the ad_H's also commute, and thus they are simultaneously 
diagonalizable (Proposition A.16). 

Of course, the definition of a Cartan subalgebra makes sense in any Lie algebra, 
semisimple or not. However, if g is not semisimple, then g may not have any Cartan 
subalgebras in the sense of Definition 7.10; see Exercise 1. (Sometimes a different 
definition of “Cartan subalgebra” is used, one that allows every complex Lie algebra 
to have a Cartan subalgebra. This other definition is equivalent to Definition 7.10 
when g is semisimple but not in general.) Even in the semisimple case we must 
prove that a Cartan subalgebra exists. 


Proposition 7.11. Let g = k_C be a complex semisimple Lie algebra and let t be any 
maximal commutative subalgebra of k. Define h ⊂ g by 

\[ \mathfrak{h} = \mathfrak{t}_{\mathbb{C}} = \mathfrak{t} + i\mathfrak{t}. \]

Then h is a Cartan subalgebra of g. 


Note that k (or any other finite-dimensional Lie algebra) contains a maximal 
commutative subalgebra. After all, any one-dimensional subalgebra t_1 of k is 
commutative. If t_1 is maximal, then we are done; if not, then we choose some 
commutative subalgebra t_2 properly containing t_1, and so on. Since k is finite 
dimensional, this process must terminate. 


Proof of Proposition 7.11. It is clear that h is a commutative subalgebra of g. We 
must first show that h is maximal commutative. So, suppose that X ∈ g commutes 
with every element of h, which certainly means that it commutes with every element 
of t. If we write X = X_1 + iX_2 with X_1 and X_2 in k, then for H in t, we have 

\[ [H, X_1 + iX_2] = [H, X_1] + i[H, X_2] = 0, \]

where [H, X_1] and [H, X_2] are in k. However, since every element of g has a unique 
decomposition as an element of k plus an element of ik, we see that [H, X_1] and 
[H, X_2] must separately be zero. Since this holds for all H in t and since t is maximal 
commutative, we must have X_1 and X_2 in t, which means that X = X_1 + iX_2 is 
in h. This shows that h is maximal commutative. 

If ⟨·,·⟩ is an inner product on g as in Proposition 7.4, then for each X in k, the 
operator ad_X : g → g is skew self-adjoint. In particular, each ad_H, H ∈ t, is skew 
self-adjoint and thus diagonalizable (Theorem A.3). Finally, if H is any element of 
h = t + it, then H = H_1 + iH_2 with H_1 and H_2 in t. Since H_1 and H_2 commute, 
ad_{H_1} and ad_{H_2} also commute and, thus, by Proposition A.16, ad_{H_1} and ad_{H_2} are 
simultaneously diagonalizable. It follows that ad_H is diagonalizable. □ 

Throughout this chapter and the subsequent chapters, we consider only Cartan 
subalgebras of the form h = t_C, where t is a maximal commutative subalgebra 
of some compact real form k of g. It is true, but not obvious, that every Cartan 
subalgebra arises in this way. Indeed, for any two Cartan subalgebras h_1 and h_2 of 
g, there exists an automorphism of g mapping h_1 to h_2. (See Proposition 2.13 and 
Theorem 2.15 in Chapter II of [Kna2].) While we will not prove this result, we will 
prove in Chapter 11 that for a fixed k, the maximal commutative subalgebra t ⊂ k 
is unique up to an automorphism of k. (See Proposition 11.7 and Theorem 11.9.) 
In light of the uniqueness of Cartan subalgebras up to automorphism, the following 
definition is meaningful. 


Definition 7.12. If g is a complex semisimple Lie algebra, the rank of g is the 
dimension of any Cartan subalgebra. 


7.3 Roots and Root Spaces 


For the rest of this chapter, we assume that we have chosen a compact real form 
k of g and a maximal commutative subalgebra t of k, and we consider the Cartan 
subalgebra h = t + it. We assume also that we have chosen an inner product on g 
that is real on k and invariant under the adjoint action of K (Proposition 7.4). 

Since the operators ad_H, H ∈ h, commute, and each such ad_H is diagonalizable, 
Proposition A.16 tells us that the ad_H's with H ∈ h are simultaneously 
diagonalizable. If X ∈ g is a simultaneous eigenvector for each ad_H, H ∈ h, then the 
corresponding eigenvalues depend linearly on H ∈ h. If this linear functional is 
nonzero, we call it a root. As in Sect. 6.6, it is convenient to express this linear 
functional as H ↦ ⟨α, H⟩ for some α ∈ h. The preceding discussion leads to the 
following definition. 


Definition 7.13. A nonzero element α of h is a root (for g relative to h) if there 
exists a nonzero X ∈ g such that 

\[ [H, X] = \langle \alpha, H \rangle X \]

for all H in h. The set of all roots is denoted R. 


Note that we follow the convention that an inner product be linear in the second 
factor, so that ⟨α, H⟩ depends linearly on H for a fixed α. 


Proposition 7.14. Each root α belongs to it ⊂ h. 


Proof. As we have already noted, each ad_H, H ∈ t, is a skew self-adjoint operator 
on g, which means that ad_H has pure imaginary eigenvalues. It follows that if α is 
a root, ⟨α, H⟩ must be pure imaginary for H ∈ t, which (since our inner product is 
real on t ⊂ k) can only happen if α is in it. □ 

Definition 7.15. If α is a root, then the root space g_α is the space of all X in g for 
which [H, X] = ⟨α, H⟩ X for all H in h. A nonzero element of g_α is called a root 
vector for α. 

More generally, if α is any element of h, we define g_α to be the space of all X in 
g for which [H, X] = ⟨α, H⟩ X for all H in h (but we do not call g_α a root space 
unless α is actually a root). 


Taking α = 0, we see that g_0 is the set of all elements of g that commute with 
every element of h. Since h is a maximal commutative subalgebra, we conclude that 
g_0 = h. If α is not zero and not a root, then g_α = {0}. 

As we have noted, the operators ad_H, H ∈ h, are simultaneously diagonalizable. 
As a result, g can be decomposed as the sum of h and the root spaces g_α. Actually, 
by Proposition A.17, the sum is direct and we have established the following result. 


Proposition 7.16. The Lie algebra g can be decomposed as a direct sum of vector 
spaces as follows: 

\[ \mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in R} \mathfrak{g}_\alpha. \]

That is to say, every element of g can be written uniquely as a sum of an element 
of h and one element from each root space g_α. Note that the decomposition is 
not a Lie algebra direct sum, since, for example, elements of h do not, in general, 
commute with elements of g_α. 


Proposition 7.17. For any α and β in h, we have 

\[ [\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subset \mathfrak{g}_{\alpha+\beta}. \]

In particular, if X is in g_α and Y is in g_{−α}, then [X, Y] is in h. Furthermore, if X 
is in g_α, Y is in g_β, and α + β is neither zero nor a root, then [X, Y] = 0. 


Proof. It follows from the Jacobi identity that ad_H is a derivation: 

\[ [H, [X, Y]] = [[H, X], Y] + [X, [H, Y]]. \]

Thus, if X is in g_α and Y is in g_β, we have 

\begin{align*}
[H, [X, Y]] &= [\langle \alpha, H \rangle X, Y] + [X, \langle \beta, H \rangle Y] \\
&= \langle \alpha + \beta, H \rangle [X, Y]
\end{align*}

for all H ∈ h, showing that [X, Y] is in g_{α+β}. □ 
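
As an illustration of Proposition 7.17, the following sketch checks the inclusion by brute force in one familiar case. It assumes the standard picture for sl(3; C), which is not developed until Sect. 7.7: h is taken to be the diagonal traceless matrices, and the elementary matrices E_jk (j ≠ k) are taken as root vectors, with E_jk having root H ↦ λ_j − λ_k. The helper names (`elementary`, `root_of`, `check_brackets`) are hypothetical and chosen only for this example.

```python
# Sketch: check [g_alpha, g_beta] inside g_{alpha+beta} for sl(3; C).
# Assumed setup (not taken from the text above): h is the diagonal traceless
# matrices, the root vectors are the elementary matrices E_jk with j != k,
# and E_jk has root H -> lambda_j - lambda_k, stored as the vector e_j - e_k.

N = 3
PAIRS = [(j, k) for j in range(N) for k in range(N) if j != k]


def elementary(j, k):
    """Matrix with a 1 in position (j, k) and zeros elsewhere."""
    return [[1 if (r, c) == (j, k) else 0 for c in range(N)] for r in range(N)]


def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(N)) for c in range(N)]
            for r in range(N)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(N)] for r in range(N)]


def root_of(j, k):
    """Coefficient vector e_j - e_k of the functional H -> lambda_j - lambda_k."""
    return tuple((m == j) - (m == k) for m in range(N))


def support(matrix):
    """Set of positions (r, c) where the matrix is nonzero."""
    return {(r, c) for r in range(N) for c in range(N) if matrix[r][c] != 0}


def check_brackets():
    """True if every bracket of two root vectors lands where Prop. 7.17 predicts."""
    position_of_root = {root_of(j, k): (j, k) for (j, k) in PAIRS}
    for (j, k) in PAIRS:
        for (l, m) in PAIRS:
            z = bracket(elementary(j, k), elementary(l, m))
            total = tuple(a + b for a, b in zip(root_of(j, k), root_of(l, m)))
            if not any(total):
                # alpha + beta = 0: the bracket must be diagonal, i.e. in h
                allowed = {(r, r) for r in range(N)}
            elif total in position_of_root:
                # alpha + beta is a root: only that root space may be hit
                allowed = {position_of_root[total]}
            else:
                # alpha + beta is neither zero nor a root: bracket vanishes
                allowed = set()
            if not support(z) <= allowed:
                return False
    return True
```

Calling `check_brackets()` runs through all 36 pairs of root vectors; the three branches correspond to the three cases singled out in the proposition.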


Proposition 7.18. 1. If α ∈ h is a root, so is −α. Specifically, if X is in g_α, then 
X* is in g_{−α}, where X* is defined by (7.2) in Proposition 7.4. 
2. The roots span h. 

Proof. For Point 1, if X = X_1 + iX_2 with X_1, X_2 ∈ k, let 

\[ \bar{X} = X_1 - iX_2. \tag{7.5} \]

Since k is closed under brackets, if H ∈ t ⊂ k and X ∈ g, we have 

\[ \overline{[H, X]} = [H, X_1] - i[H, X_2] = [H, \bar{X}]. \]

In particular, if X is a root vector with root α ∈ it, then for all H ∈ t, we have 

\[ [H, \bar{X}] = \overline{[H, X]} = \overline{\langle \alpha, H \rangle}\, \bar{X} = -\langle \alpha, H \rangle \bar{X}, \tag{7.6} \]

since ⟨α, H⟩ is pure imaginary for H ∈ t. Extending (7.6) by linearity in H, we see 
that [H, X̄] = −⟨α, H⟩ X̄ for all H ∈ h. Thus, X̄ is a root vector corresponding to 
the root −α. It follows that X* = −X̄ belongs to g_{−α}. 

For Point 2, suppose that the roots did not span h. Then there would exist a 
nonzero H ∈ h such that ⟨α, H⟩ = 0 for all α ∈ R. Then we would have 
[H, H'] = 0 for all H' in h, and also 

\[ [H, X] = \langle \alpha, H \rangle X = 0 \]

for X in g_α. Thus, by Proposition 7.16, H would be in the center of g, contradicting 
the definition of a semisimple Lie algebra. □ 


We now develop a key tool in the study of a semisimple Lie algebra g, the 
existence of certain subalgebras of g isomorphic to sl(2; C). 


Theorem 7.19. For each root α, we can find linearly independent elements X_α in 
g_α, Y_α in g_{−α}, and H_α in h such that H_α is a multiple of α and such that 

\begin{align*}
[H_\alpha, X_\alpha] &= 2X_\alpha, \\
[H_\alpha, Y_\alpha] &= -2Y_\alpha, \\
[X_\alpha, Y_\alpha] &= H_\alpha. \tag{7.7}
\end{align*}

Furthermore, Y_α can be chosen to equal X_α^*. 


If X_α, Y_α, H_α are as in the theorem, then on the one hand, [H_α, X_α] = 2X_α, 
while on the other hand, [H_α, X_α] = ⟨α, H_α⟩ X_α. Thus, 

\[ \langle \alpha, H_\alpha \rangle = 2. \tag{7.8} \]

Meanwhile, H_α is a multiple of α and the unique such multiple compatible 
with (7.8) is 

\[ H_\alpha = 2 \frac{\alpha}{\langle \alpha, \alpha \rangle}. \tag{7.9} \]
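
To make (7.7) through (7.9) concrete, here is a small sketch for one root of sl(3; C). It assumes the setup of Sect. 7.7: h is the diagonal traceless matrices, the inner product is ⟨A, B⟩ = trace(A*B), and X* is the conjugate transpose. Under those assumptions the root H ↦ λ_1 − λ_2 is represented by the matrix diag(1, −1, 0). The helper names are hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Sketch for sl(3; C) with alpha = e_1 - e_2 (assumed example, not from the text).
# Under the inner product <A, B> = trace(A^* B), the functional
# H -> lambda_1 - lambda_2 is represented by the matrix diag(1, -1, 0).


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    n = len(a)
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]


def scale(c, a):
    return [[c * entry for entry in row] for row in a]


def inner(a, b):
    """trace(a^* b) for real matrices, where a^* is just the transpose."""
    n = len(a)
    return sum(a[r][c] * b[r][c] for r in range(n) for c in range(n))


alpha = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
X_alpha = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # E_12, a root vector for alpha
Y_alpha = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]   # E_21, the transpose of X_alpha

# Coroot from (7.9): H_alpha = 2 alpha / <alpha, alpha>.
H_alpha = scale(Fraction(2, inner(alpha, alpha)), alpha)

# The three relations in (7.7).
relations_hold = (
    bracket(H_alpha, X_alpha) == scale(2, X_alpha)
    and bracket(H_alpha, Y_alpha) == scale(-2, Y_alpha)
    and bracket(X_alpha, Y_alpha) == H_alpha
)
```

Here ⟨α, α⟩ = 2, so H_α coincides with the matrix representing α; for a differently normalized inner product the two would differ by a scalar, while (7.8) would still hold.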


For use in Part III, we record the following consequence of Theorem 7.19. 

Corollary 7.20. For each root α, let X_α, Y_α, and H_α be as in Theorem 7.19, with 
Y_α = X_α^*. Then the elements 

\[ E_1^\alpha := \frac{i}{2} H_\alpha; \qquad E_2^\alpha := \frac{i}{2} (X_\alpha + Y_\alpha); \qquad E_3^\alpha := \frac{1}{2} (Y_\alpha - X_\alpha) \]

are linearly independent elements of k and satisfy the commutation relations 

\[ [E_1^\alpha, E_2^\alpha] = E_3^\alpha; \qquad [E_2^\alpha, E_3^\alpha] = E_1^\alpha; \qquad [E_3^\alpha, E_1^\alpha] = E_2^\alpha. \]

Thus, by Example 3.27, the span of E_1^α, E_2^α, and E_3^α is a subalgebra of k isomorphic 
to su(2). 

Proof. Since α belongs to it and H_α is, by (7.9), a real multiple of α, the element 
(i/2) H_α will belong to t ⊂ k. Meanwhile, we may check that (E_2^α)* = −E_2^α and 
(E_3^α)* = −E_3^α. From the way the map X ↦ X* was defined (Proposition 7.4), 
it follows that E_2^α and E_3^α also belong to k. Furthermore, since X_α, Y_α, and 
H_α are linearly independent, E_1^α, E_2^α, and E_3^α are also independent. Finally, 
direct computation with the commutation relations in (7.7) confirms the claimed 
commutation relations for the E_j^α's. □ 
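
The "direct computation" in the proof can be spot-checked in the simplest case. The sketch below assumes g = sl(2; C) with the usual basis X, Y, H playing the roles of X_α, Y_α, H_α (so that Y is the conjugate transpose of X), and verifies both the three commutation relations and the fact that each E_j is skew-Hermitian, as elements of su(2) must be. All function names are hypothetical.

```python
# Sketch: the su(2) relations of Corollary 7.20 for g = sl(2; C), using the
# usual basis X, Y, H (an assumed concrete choice of X_alpha, Y_alpha, H_alpha).


def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(2)] for r in range(2)]


def combine(c1, a, c2, b):
    """Entrywise linear combination c1 * a + c2 * b."""
    return [[c1 * a[r][c] + c2 * b[r][c] for c in range(2)] for r in range(2)]


def is_skew_hermitian(a):
    """True if the conjugate transpose of a equals -a."""
    return all(complex(a[c][r]).conjugate() == -a[r][c]
               for r in range(2) for c in range(2))


H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
ZERO = [[0, 0], [0, 0]]

E1 = combine(0.5j, H, 0, ZERO)   # (i/2) H
E2 = combine(0.5j, X, 0.5j, Y)   # (i/2)(X + Y)
E3 = combine(0.5, Y, -0.5, X)    # (1/2)(Y - X)

# The entries are multiples of 1/4, so floating-point comparison is exact here.
relations_hold = (
    bracket(E1, E2) == E3
    and bracket(E2, E3) == E1
    and bracket(E3, E1) == E2
)
```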


Definition 7.21. The element H_α = 2α/⟨α, α⟩ in Theorem 7.19 is the coroot 
associated to the root α. 


We begin the proof of Theorem 7.19 with a lemma. 


Lemma 7.22. Suppose that X is in g_α, that Y is in g_{−α}, and that H is in h. Then 
[X, Y] is in h and 

\[ \langle [X, Y], H \rangle = \langle \alpha, H \rangle \langle Y, X^* \rangle, \tag{7.10} \]

where X* is as in Proposition 7.4. 


In the proof of Theorem 7.19, we will need to know not just that [X, Y] ∈ h, but 
where in h the element [X, Y] lies. This information is obtained by computing the 
inner product of [X, Y] with each element of h, as we have done in (7.10). 


Proof. That [X, Y] is in h follows from Proposition 7.17. Then, using Proposition 
7.4, we compute that 

\[ \langle [X, Y], H \rangle = \langle \mathrm{ad}_X(Y), H \rangle = \langle Y, \mathrm{ad}_{X^*}(H) \rangle = -\langle Y, [H, X^*] \rangle. \tag{7.11} \]

Since X is in g_α, Proposition 7.18 tells us that the element X* is in g_{−α}, so that 
[H, X*] = −⟨α, H⟩ X* and (7.11) reduces to the desired result. (Recall that we 
take inner products to be linear in the second factor.) □ 


Proof of Theorem 7.19. Choose a nonzero X in g_α, so that X* = −X̄ is a nonzero 
element of g_{−α}. Applying Lemma 7.22 with Y = X* gives 

\[ \langle [X, X^*], H \rangle = \langle \alpha, H \rangle \langle X^*, X^* \rangle. \tag{7.12} \]

From (7.12), we see that [X, X*] ∈ h is orthogonal to every H ∈ h that is orthogonal 
to α, which happens only if [X, X*] is a multiple of α. Furthermore, if we choose 
H so that ⟨α, H⟩ ≠ 0, we see that ⟨[X, X*], H⟩ ≠ 0, so that [X, X*] ≠ 0. Now, if 
we evaluate (7.12) with H = [X, X*], we obtain 

\[ \langle [X, X^*], [X, X^*] \rangle = \langle \alpha, [X, X^*] \rangle \langle X^*, X^* \rangle. \]

Since [X, X*] ≠ 0, we conclude that ⟨α, [X, X*]⟩ is real and strictly positive. 
Let H = [X, X*] and Y = X*, and define elements of g as follows: 

\[ H_\alpha = \frac{2}{\langle \alpha, H \rangle} H, \qquad X_\alpha = \sqrt{\frac{2}{\langle \alpha, H \rangle}}\, X, \qquad Y_\alpha = \sqrt{\frac{2}{\langle \alpha, H \rangle}}\, Y. \]

Then ⟨α, H_α⟩ = 2, from which it follows that [H_α, X_α] = 2X_α and [H_α, Y_α] = 
−2Y_α. Furthermore, [X_α, Y_α] = 2[X, Y]/⟨α, H⟩ = H_α. Thus, these elements 
satisfy the relations claimed in the theorem, and Y_α = X_α^*. Furthermore, since these 
elements are eigenvectors for ad_{H_α} with distinct eigenvalues 2, −2, and 0, they must 
be linearly independent (Proposition A.1). □ 


We now make use of the subalgebras in Theorem 7.19 to obtain results about 
roots and root spaces. Note that 

\[ \mathfrak{s}^\alpha := \langle X_\alpha, Y_\alpha, H_\alpha \rangle \tag{7.13} \]

acts on g by (the restriction to s^α of) the adjoint representation. 


Theorem 7.23. 1. For each root α, the only multiples of α that are roots are α 
and −α. 
2. For each root α, the root space g_α is one dimensional. 


Point 1 of Theorem 7.23 should be contrasted with the results of Exercise 8 in 
the case of a Lie algebra that is not semisimple. 


[Fig. 7.1: If β = 2α were a root, the orthogonal complement of s^α in V^α 
would contain an element of h orthogonal to H_α.] 


Lemma 7.24. If α and cα are both roots with |c| > 1, then c = ±2. 


Proof. Let s^α be as in (7.13). If β = cα is also a root and X is a nonzero element 
of g_β, then by (7.8), we have 

\[ [H_\alpha, X] = \langle \beta, H_\alpha \rangle X = \bar{c} \langle \alpha, H_\alpha \rangle X = 2\bar{c} X. \]

Thus, 2c̄ is an eigenvalue for the adjoint action of s^α on g. By Point 1 of 
Theorem 4.34, any such eigenvalue must be an integer, meaning that c is an integer 
multiple of 1/2. But by reversing the roles of α and β in the argument, we see that 
1/c must also be an integer multiple of 1/2. 

Now, suppose c = n/2 for some integer n and that 1/c = 2/n is an integer 
multiple of 1/2, so that 2/c = 4/n is an integer. Then n = ±1, ±2, or ±4. Thus, 
c = ±1/2, ±1, or ±2, of which only ±2 have |c| > 1. □ 
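
The arithmetic at the end of this proof is a finite search, which the following sketch carries out directly. The search bound is an arbitrary assumption made for the example (any bound of at least 4 gives the same answer, since n must divide 4), and the function name is hypothetical.

```python
from fractions import Fraction


def admissible_multiples(bound=100):
    """All c = n/2, with n a nonzero integer and |n| <= bound, such that
    2/c = 4/n is also an integer."""
    result = []
    for n in range(-bound, bound + 1):
        if n != 0 and 4 % n == 0:
            result.append(Fraction(n, 2))
    return sorted(result)


candidates = admissible_multiples()
# Keep only the multiples with |c| > 1, as in the statement of the lemma.
large = [c for c in candidates if abs(c) > 1]
```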


Proof of Theorem 7.23. Since there are only finitely many roots, there is some 
smallest positive multiple of α that is a root. Replacing α by this smallest multiple, 
we can assume that if cα is a root, then either c = ±1 or |c| > 1. In the latter case, 
Lemma 7.24 tells us that c = ±2. 

Let s^α be as in (7.13), with Y_α chosen to equal X_α^*. Let V^α be the subspace of 
g spanned by H_α and all the root spaces g_β for which β is a multiple of α. (See 
Figure 7.1.) We claim that V^α is a subalgebra of g. To verify this claim, observe first 
that if a root β is a multiple of α then by Lemma 7.22, every element of [g_β, g_{−β}] 
is in h and is orthogonal to every element of h that is orthogonal to β. Thus, every 
element of [g_β, g_{−β}] is a multiple of β, which is a multiple of α, which is a multiple 
of H_α. Observe next that if X ∈ g_β, then [H_α, X] is a multiple of X. Observe, finally, 
that if β and β' are roots that are multiples of α with β' ≠ −β, then [g_β, g_{β'}] ⊂ 
g_{β+β'}, where β + β' ≠ 0 is again a multiple of α. 

Since V^α is a subalgebra of g, it is certainly invariant under the adjoint action of 
s^α ⊂ V^α. Note also that s^α itself is an invariant subspace for the adjoint action of s^α 
on V^α. Now, since Y_α = X_α^* and H_α is a positive multiple of α ∈ it ⊂ h, we see 
that X* ∈ s^α for every X ∈ s^α. Thus, by Propositions 7.4 and 4.27, the orthogonal 
complement U^α of s^α in V^α will also be invariant under the adjoint action of s^α. 

Now, ⟨α, H_α⟩ = 2 and, by Lemma 7.24, any multiples β of α that are roots 
are of the form β = ±α or β = ±2α. Thus, the eigenvalues of ad_{H_α} in V^α are 
0, ±2, and, possibly, ±4. If U^α ≠ {0}, then U^α will contain an eigenvector for 
ad_{H_α}, with an eigenvalue λ ∈ {0, ±2, ±4}. Since λ is even, it follows from Point 4 
of Theorem 4.34 that 0 must also be an eigenvalue for ad_{H_α} in U^α. But this is 
impossible, since H_α is (up to a constant multiple) the only eigenvector for ad_{H_α} in 
V^α with eigenvalue 0, and U^α is orthogonal to H_α ∈ s^α. Thus, actually, U^α = {0}, 
which means that V^α = s^α. Thus, the only multiples of α that are roots are ±α and 
the root spaces with roots ±α are one dimensional. □ 


Figure 7.1 shows a configuration of “roots” consistent with Lemma 7.24, but 
which cannot actually be the root system of a semisimple Lie algebra. 


7.4 The Weyl Group 


We now introduce an important symmetry of the set R of roots, known as the Weyl 
group. In this section, we follow the Lie algebra approach to the Weyl group, as 
opposed to the Lie group approach we followed in Sect. 6.6 in the SU(3) case. We 
will return to the Lie group approach to the Weyl group in Chapter 11 and show 
(Sect. 11.7) that the two approaches are equivalent. 


Definition 7.25. For each root α ∈ R, define a linear map s_α : h → h by the 
formula 

\[ s_\alpha \cdot H = H - 2 \frac{\langle \alpha, H \rangle}{\langle \alpha, \alpha \rangle} \alpha. \tag{7.14} \]

The Weyl group of R, denoted W, is then the subgroup of GL(h) generated by all 
the s_α's with α ∈ R. 


Note that since each root α is in it and our inner product is real on t, if H is in 
it, then s_α · H is also in it. As a map of it to itself, s_α is the reflection about the 
hyperplane orthogonal to α. That is to say, s_α · H = H whenever H is orthogonal 
to α, and s_α · α = −α. Since each reflection is an orthogonal linear transformation, 
we see that W is a subgroup of the orthogonal group O(it). 


Theorem 7.26. The action of W on it preserves R. That is to say, if α is a root, 
then w · α is a root for all w ∈ W. 


Proof. For each α ∈ R, consider the invertible linear operator S_α on g given by 

\[ S_\alpha = e^{\mathrm{ad}_{X_\alpha}} e^{-\mathrm{ad}_{Y_\alpha}} e^{\mathrm{ad}_{X_\alpha}}. \]

Now, if H ∈ h satisfies ⟨α, H⟩ = 0, then [H, X_α] = ⟨α, H⟩ X_α = 0. Thus, H and 
X_α commute, which means that ad_H and ad_{X_α} also commute, and similarly for ad_H 
and ad_{Y_α}. Thus, if ⟨α, H⟩ = 0, the operator S_α will commute with ad_H, so that 

\[ S_\alpha \mathrm{ad}_H S_\alpha^{-1} = \mathrm{ad}_H, \qquad \langle \alpha, H \rangle = 0. \tag{7.15} \]




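On the other hand, if we apply Point 3 of Theorem 4.34 to the adjoint action of 
s^α ≅ sl(2; C) on g, we see that 

\[ S_\alpha \mathrm{ad}_{H_\alpha} S_\alpha^{-1} = -\mathrm{ad}_{H_\alpha}. \tag{7.16} \]

By combining (7.15) and (7.16), we see that for all H ∈ h, we have 

\[ S_\alpha \mathrm{ad}_H S_\alpha^{-1} = \mathrm{ad}_{s_\alpha \cdot H}. \tag{7.17} \]

Now if β is any root and X is an associated root vector, consider the vector 
S_α^{-1}(X) ∈ g. We compute that 

\begin{align*}
\mathrm{ad}_H(S_\alpha^{-1}(X)) &= S_\alpha^{-1} (S_\alpha \mathrm{ad}_H S_\alpha^{-1})(X) \\
&= S_\alpha^{-1} \mathrm{ad}_{s_\alpha \cdot H}(X) \\
&= \langle \beta, s_\alpha \cdot H \rangle S_\alpha^{-1}(X) \\
&= \langle s_\alpha^{-1} \cdot \beta, H \rangle S_\alpha^{-1}(X).
\end{align*}

Thus, S_α^{-1}(X) is a root vector with root s_α^{-1} · β = s_α · β. This shows that the set of 
roots is invariant under each s_α and, thus, under W. □ 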
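
The sign flip in (7.16) can be seen by hand in the smallest possible setting. The sketch below is only an analogy to the proof: it replaces the adjoint action on g by the two-dimensional defining representation of sl(2; C), which is an assumption made for illustration. There X and Y square to zero, so each exponential is I plus the matrix, and the operator e^X e^{−Y} e^X can be multiplied out exactly. The helper names are hypothetical.

```python
# Sketch in the defining representation of sl(2; C) (an assumed stand-in for
# the adjoint action used in the proof). X and Y square to zero, so the
# exponential series stops after the linear term: e^A = I + A.


def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def neg(a):
    return [[-entry for entry in row] for row in a]


def exp_square_zero(a):
    """e^a for a 2x2 matrix with a*a = 0, namely I + a."""
    assert matmul(a, a) == [[0, 0], [0, 0]]
    return [[(1 if r == c else 0) + a[r][c] for c in range(2)] for r in range(2)]


H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

# S = e^X e^{-Y} e^X and its inverse e^{-X} e^{Y} e^{-X}.
S = matmul(matmul(exp_square_zero(X), exp_square_zero(neg(Y))), exp_square_zero(X))
S_inv = matmul(matmul(exp_square_zero(neg(X)), exp_square_zero(Y)),
               exp_square_zero(neg(X)))

# Conjugating H by S reverses its sign, the analogue of (7.16).
conjugated = matmul(matmul(S, H), S_inv)
```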


Actually, since s_α · s_α · β = β, each reflection maps R onto R. It follows that 
each w ∈ W also maps R onto R. 


Corollary 7.27. The Weyl group is finite. 


Proof. Since the roots span h, each w ∈ W is determined by its action on R. Since, 
also, w maps R onto R, we see that W may be thought of as a subgroup of the 
permutation group on the roots. □ 
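
The proof suggests a direct way to enumerate a Weyl group: record each reflection as a permutation of the roots and close up under composition. The sketch below does this for one assumed example, the six vectors e_j − e_k (j ≠ k) in R^3 with the standard inner product, which is the usual realization of the roots of sl(3; C). The function names are hypothetical.

```python
from fractions import Fraction

# Assumed example: the root system of sl(3; C), realized as the six vectors
# e_j - e_k (j != k) in R^3 with the standard inner product.
ROOTS = [tuple((m == j) - (m == k) for m in range(3))
         for j in range(3) for k in range(3) if j != k]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, v):
    """s_alpha . v = v - 2 <alpha, v> / <alpha, alpha> * alpha, as in (7.14)."""
    coeff = Fraction(2 * dot(alpha, v), dot(alpha, alpha))
    return tuple(x - coeff * a for x, a in zip(v, alpha))


def as_permutation(alpha):
    """Record s_alpha by where it sends each root.
    Raises ValueError if some image is not a root."""
    return tuple(ROOTS.index(reflect(alpha, beta)) for beta in ROOTS)


def generate_weyl_group():
    """Close the set of reflections under composition, as permutations of ROOTS."""
    generators = [as_permutation(alpha) for alpha in ROOTS]
    identity = tuple(range(len(ROOTS)))
    group, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for s in generators:
            sw = tuple(s[i] for i in w)  # apply w first, then s
            if sw not in group:
                group.add(sw)
                frontier.append(sw)
    return group
```

For this example the closure stops at six elements, far fewer than the 720 permutations of six roots, which reflects how restrictive it is for a permutation of R to come from an orthogonal transformation generated by reflections.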


7.5 Root Systems 


In this section, we record several important properties of the roots, using results from 
the two previous sections. Recall that for each root α, we have an element H_α of h 
contained in [g_α, g_{−α}] as in Theorem 7.19. As we saw in (7.8) and (7.9), H_α satisfies 
⟨α, H_α⟩ = 2 and is related to α by the formula H_α = 2α/⟨α, α⟩. In particular, the 
element H_α is independent of the choice of X_α and Y_α in Theorem 7.19. 


Definition 7.28. For each root α, the element H_α ∈ h given by 

\[ H_\alpha = 2 \frac{\alpha}{\langle \alpha, \alpha \rangle} \]

is the coroot associated to the root α. 

Proposition 7.29. For all roots α and β, we have that 

\[ \langle \beta, H_\alpha \rangle = 2 \frac{\langle \alpha, \beta \rangle}{\langle \alpha, \alpha \rangle} \tag{7.18} \]

is an integer. 

We have actually already made use of this result in the proof of Lemma 7.24. 


Proof. If s^α = ⟨X_α, Y_α, H_α⟩ is as in Theorem 7.19 and X is a root vector associated 
to the root β, then [H_α, X] = ⟨β, H_α⟩ X. Thus, ⟨β, H_α⟩ is an eigenvalue for the 
adjoint action of s^α ≅ sl(2; C) on g. Point 1 of Theorem 4.34 then shows that 
⟨β, H_α⟩ must be an integer. □ 


Recall from elementary linear algebra that if α and β are elements of an inner 
product space, the orthogonal projection of β onto α is given by 

\[ \frac{\langle \alpha, \beta \rangle}{\langle \alpha, \alpha \rangle} \alpha. \]

The quantity on the right-hand side of (7.18) is thus twice the coefficient of α 
in the projection of β onto α. We may therefore interpret the integrality result in 
Proposition 7.29 in the following geometric way: 


If α and β are roots, the orthogonal projection of α onto β must be an integer or half-integer 
multiple of β. 


Alternatively, we may think about Proposition 7.29 as saying that β and s_α · β must 
differ by an integer multiple of α [compare (7.14)]. 

If we think of the set R of roots as a subset of the real inner product space 
E := it, we may summarize the properties of R as follows. 


Theorem 7.30. The set R of roots is a finite set of nonzero elements of a real inner 
product space E, and R has the following additional properties. 


1. The roots span E. 

2. If α ∈ R, then −α ∈ R and the only multiples of α in R are α and −α. 

3. If α and β are in R, so is s_α · β, where s_α is the reflection defined by (7.14). 

4. For all α and β in R, the quantity 

\[ 2 \frac{\langle \alpha, \beta \rangle}{\langle \alpha, \alpha \rangle} \]

is an integer. 


Any such collection of vectors is called a root system. We will look in detail at 
the properties of root systems in Chapter 8. 
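
Properties 2 through 4 of Theorem 7.30 are finite checks and can be tested mechanically. The sketch below does so for two assumed examples: the vectors e_j − e_k in R^3 (the usual realization of the roots of sl(3; C)), and the one-dimensional configuration {±α, ±2α} of the kind depicted in Figure 7.1. Property 1 (spanning) is deliberately left out, and all names are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, v):
    """s_alpha . v = v - 2 <alpha, v> / <alpha, alpha> * alpha, as in (7.14)."""
    coeff = Fraction(2 * dot(alpha, v), dot(alpha, alpha))
    return tuple(x - coeff * a for x, a in zip(v, alpha))


def multiple_of(beta, alpha):
    """Return c if beta = c * alpha for a scalar c, else None (alpha nonzero)."""
    idx = next(i for i, a in enumerate(alpha) if a != 0)
    c = Fraction(beta[idx], alpha[idx])
    return c if all(b == c * a for a, b in zip(alpha, beta)) else None


def failed_properties(roots):
    """Numbers of the properties 2, 3, 4 of Theorem 7.30 that fail for `roots`.
    The vectors are assumed nonzero with integer entries; property 1 (spanning)
    is not tested."""
    root_set = set(roots)
    failed = set()
    for alpha in roots:
        if tuple(-a for a in alpha) not in root_set:
            failed.add(2)
        for beta in roots:
            c = multiple_of(beta, alpha)
            if c is not None and c not in (1, -1):
                failed.add(2)
            if reflect(alpha, beta) not in root_set:
                failed.add(3)
            if Fraction(2 * dot(alpha, beta), dot(alpha, alpha)).denominator != 1:
                failed.add(4)
    return failed


A2 = [tuple((m == j) - (m == k) for m in range(3))
      for j in range(3) for k in range(3) if j != k]
DOUBLED = [(1,), (-1,), (2,), (-2,)]   # the configuration {±α, ±2α}
```

In this sketch the doubled configuration passes the reflection and integrality tests and fails only the condition on multiples, which matches the remark that Figure 7.1 is consistent with Lemma 7.24 but is not a root system of a semisimple Lie algebra.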


7.6 Simple Lie Algebras 


Every semisimple Lie algebra decomposes as a direct sum of simple algebras 
(Theorem 7.8). In this section, we give a criterion for a semisimple Lie algebra 
to be simple. We will eventually see (Sect. 8.11) that most of the familiar examples 
of semisimple Lie algebras are actually simple. 


Proposition 7.31. Suppose g is a real Lie algebra and that the complexification g_C 
of g is simple. Then g is also simple. 


Proof. Since g_C is simple, the dimension of g_C over C is at least 2, so that the 
dimension of g over R is also at least 2. If g had a nontrivial ideal h, then the 
complexification h_C of h would be a nontrivial ideal in g_C. □ 


The converse of Proposition 7.31 is false in general. The Lie algebra so(3; 1), for 
example, is simple as a real Lie algebra, and yet its complexification is isomorphic 
to so(4; C), which in turn is isomorphic to sl(2; C) ⊕ sl(2; C). See Exercise 14. 


Theorem 7.32. Suppose K is a compact matrix Lie group whose Lie algebra k is 
simple as a real Lie algebra. Then the complexification g := k_C of k is simple as a 
complex Lie algebra. 


For more results about simple algebras over R, see Exercises 12 and 13. Before 
proving Theorem 7.32, we introduce a definition. 


Definition 7.33. If g is a real Lie algebra, g admits a complex structure if there 
exists a "multiplication by i" map J : g → g that makes g into a complex vector 
space in such a way that the bracket map [·,·] : g × g → g is complex bilinear. 


Here, bilinearity of the bracket means, in particular, that [JX, Y] = J[X, Y] for 
all X, Y ∈ g. Equivalently, g admits a complex structure if there exists a complex 
Lie algebra h and a real-linear map φ : h → g that is one-to-one and onto and 
satisfies φ([X, Y]) = [φ(X), φ(Y)] for all X, Y ∈ h. 


Lemma 7.34. Suppose that K is a compact matrix Lie group whose Lie algebra k 
is noncommutative. Then k does not admit a complex structure. 


Proof. Suppose, toward a contradiction, that k does admit a complex structure J. 
By Proposition 7.4, there exists a real inner product on k with respect to which ad_H 
is skew symmetric for all H ∈ k. If we choose H not in the center of k then ad_H 
is nonzero and skew symmetric, hence diagonalizable in k_C with pure-imaginary 
eigenvalues, not all of which are zero. In particular, ad_H is not nilpotent. 

On the other hand, since the bracket is complex bilinear with respect to J, if we 
view k as a complex vector space with respect to the map J, then ad_H is complex 
linear. Since ad_H is not nilpotent, it has a nonzero eigenvalue λ = a + ib ∈ C. Thus, 
there is a nonzero X ∈ k such that 

\[ [H, X] = (a + ib) \cdot X = aX + bJX. \]




If we then consider the element 

\[ H' := \bar{\lambda} \cdot H = aH - bJH, \]

we may compute, using the linearity of the bracket with respect to J and the identity 
J² = −I, that 

\[ [H', X] = |\lambda|^2 X = (a^2 + b^2) X. \]

But ad_{H'} is also skew symmetric, and, thus, 

\[ |\lambda|^2 \langle X, X \rangle = \langle \mathrm{ad}_{H'}(X), X \rangle = -\langle X, \mathrm{ad}_{H'}(X) \rangle = -|\lambda|^2 \langle X, X \rangle, \]

which is impossible if λ and X are both nonzero. □ 


Proof of Theorem 7.32. Suppose to the contrary that g were not simple, so that it 
decomposes as a sum of at least two simple algebras g_j. By Proposition 7.9, the 
decomposition of a semisimple Lie algebra into a sum of simple algebras is 
unique up to ordering of the summands. On the other hand, if g decomposes as the 
sum of the g_j's, it also decomposes as the sum of the ḡ_j's, where the map X ↦ X̄ 
is as in (7.5) and ḡ_j denotes the image of g_j under this map. Thus, for each j, we 
must have ḡ_j = g_k for some k. 

Suppose there is some j for which ḡ_j = g_j. Then for all X ∈ g_j, the element 
X + X̄ is in g_j ∩ k, from which it follows that g_j ∩ k is a nonzero ideal in k. But g_j ∩ k 
cannot be all of k, or else g_j would be all of g. Thus, g_j ∩ k would be a nontrivial 
ideal in k, contradicting our assumption that k is simple. On the other hand, if we 
pick some j with ḡ_j = g_k, k ≠ j, then (g_j ⊕ g_k) ∩ k is a nonzero ideal in k, which 
must be all of k. We conclude, then, that there must be exactly two summands g_1 
and g_2 in the decomposition of g, satisfying ḡ_1 = g_2. 

Let us then define a real-linear map φ : g_1 → k by φ(X) = X + X̄. Note that 
for any X ∈ g_1, the element X̄ is in g_2, so that [X, X̄] = 0. From this, we can easily 
check that φ satisfies φ([X, Y]) = [φ(X), φ(Y)] for all X, Y ∈ g_1. Furthermore, φ 
is injective because, for any X ∈ g_1, we have X̄ ∈ g_2 and thus X̄ cannot equal −X 
unless X = 0. Finally, by counting real dimensions, we see that φ is also surjective. 
Since g_1 is a complex Lie algebra and φ is a real Lie algebra isomorphism, we see 
that k admits a complex structure, contradicting Lemma 7.34. □ 


Theorem 7.35. Let g = k_C be a complex semisimple Lie algebra, let t be a maximal 
commutative subalgebra of k, and let h = t_C be the associated Cartan subalgebra 
of g. Let R ⊂ h be the root system for g relative to h. If g is not simple then h 
decomposes as an orthogonal direct sum of nonzero subspaces h_1 and h_2 in such a 
way that every element of R is either in h_1 or in h_2. Conversely, if h decomposes in 
this way, then g is not simple. 


We may restate the first part of the theorem in contrapositive form: If there does 
not exist an orthogonal decomposition of h as h = h_1 ⊕ h_2 (with dim h_1 > 0 and 
dim h_2 > 0) such that every root is either in h_1 or h_2, then g is simple. In Sect. 8.6, 
we will show that if the "Dynkin diagram" of the root system is connected, then 
no such decomposition of h exists. We will then be able to check that most of our 
examples of semisimple Lie algebras are actually simple. 

Proof of Theorem 7.35, Part 1. Assume first that g is not simple, so that, by 
Theorem 7.32, k is not simple either. Thus, k has a nontrivial ideal k_1. Now, ideals 
in k are precisely invariant subspaces for the adjoint action of k on itself, which are 
the same as the invariant subspaces for the adjoint action of K on k. If we choose 
an inner product on k that is Ad-K-invariant, the orthogonal complement of k_1 is also 
an ideal. Thus, k decomposes as the Lie algebra direct sum k_1 ⊕ k_2, where k_2 = (k_1)^⊥, 
in which case g decomposes as g_1 ⊕ g_2, where g_1 = (k_1)_C and g_2 = (k_2)_C. 

Let t be any maximal commutative subalgebra of k and h = t_C the associated 
Cartan subalgebra of g. We claim that t must decompose as t_1 ⊕ t_2, where t_1 ⊂ k_1 
and t_2 ⊂ k_2. Suppose H and H' are two elements of t, with H = X_1 + X_2 and 
H' = Y_1 + Y_2, with X_1, Y_1 ∈ k_1 and X_2, Y_2 ∈ k_2. Then 

\[ 0 = [H, H'] = [X_1, Y_1] + [X_2, Y_2], \]

with [X_1, Y_1] ∈ k_1 and [X_2, Y_2] ∈ k_2, which can happen only if [X_1, Y_1] = 
[X_2, Y_2] = 0. Thus, 

\[ [X_1, H'] = [X_1, Y_1] = 0, \]

showing that X_1 commutes with every element of t. Since t is maximal commutative, 
X_1 must actually be in t, and similarly for X_2. That is to say, for every X ∈ t, 
the k_1- and k_2-components of X also belong to t. From this observation, it follows 
easily that t is the direct sum of the subalgebras 

\[ \mathfrak{t}_1 := \mathfrak{t} \cap \mathfrak{k}_1, \qquad \mathfrak{t}_2 := \mathfrak{t} \cap \mathfrak{k}_2. \]

The algebra h then splits as h_1 ⊕ h_2, where h_1 = (t_1)_C and h_2 = (t_2)_C. Let R_1 
and R_2 be the roots for g_1 relative to h_1 and g_2 relative to h_2, respectively. If α is an 
element of R_1 and X ∈ g_1 is an associated root vector, then consider any element 
H = H_1 + H_2 of h, where H_1 ∈ h_1 and H_2 ∈ h_2. We have 

\[ [H, X] = [H_1, X] + [H_2, X] = \langle \alpha, H_1 \rangle X, \]

since H_2 ∈ g_2 and X ∈ g_1. Thus, X is also a root vector for g relative to h, with 
the associated root being α. Similarly, every root β ∈ R_2 is also a root for g relative 
to h. 

We now claim that every root α for g relative to h is either an element of R_1 or 
an element of R_2. If X = X_1 + X_2 is a root vector associated to the root α, then for 
H_1 ∈ h_1 we have 

\[ [H_1, X] = [H_1, X_1 + X_2] = [H_1, X_1] = \langle \alpha, H_1 \rangle X_1. \]

On the other hand, [H_1, X] must be a multiple of X. Thus, either X_2 = 0 or α is 
orthogonal to h_1. If α is orthogonal to h_1 then it belongs to h_2 and so α is a root for 
g_2 (i.e., α ∈ R_2). If, on the other hand, X_2 = 0, then 0 = [H_2, X] = ⟨α, H_2⟩ X for 
all H_2 ∈ h_2, so that α is orthogonal to h_2. In that case, α ∈ h_1 and α must belong 
to R_1. □ 

Proof of Theorem 7.35, Part 2. Suppose now h splits as h = h_1 ⊕ h_2, with h_1 and 
h_2 being nonzero, orthogonal subspaces of h, in such a way that every root is either 
in h_1 or h_2. Let R_j = R ∩ h_j, j = 1, 2, and define subspaces g_j of g as 

\[ \mathfrak{g}_j = \mathfrak{h}_j \oplus \bigoplus_{\alpha \in R_j} \mathfrak{g}_\alpha, \qquad j = 1, 2. \]

Then g decomposes as a vector space as g_1 ⊕ g_2. But h_1 commutes with each g_α 
with α ∈ R_2, because α is in h_2, which is orthogonal to h_1. Similarly, h_2 commutes 
with g_α, α ∈ R_1, and, of course, with h_1. Finally, if α ∈ R_1 and β ∈ R_2, then 
[g_α, g_β] = {0}, because α + β is neither zero nor a root: it has a nonzero component 
in each of h_1 and h_2 and so lies in neither. Thus, actually, g is the Lie algebra 
direct sum of g_1 and g_2, showing that g is not simple. □ 


7.7 The Root Systems of the Classical Lie Algebras 


In this section, we look at how the structures described in this chapter work out in the 
case of the “classical” semisimple Lie algebras, that is, the special linear, orthogonal, 
and symplectic algebras over C. We label each of our Lie algebras in such a way 
that the rank is n, and we split our analysis of the orthogonal algebras into the even 
and odd cases. For each of the classical Lie algebras, we use a constant multiple of 
the Hilbert-Schmidt inner product $\langle X, Y \rangle = \operatorname{trace}(X^* Y)$, which is invariant under
the adjoint action of the corresponding compact group. 


7.7.1 The Special Linear Algebras sl(n + 1; C)


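We work with the compact real form $\mathfrak{k} = \mathfrak{su}(n+1)$ and the commutative subalgebra
$\mathfrak{t}$ which is the intersection of the set of diagonal matrices with $\mathfrak{su}(n+1)$; that is,

\[
\mathfrak{t} = \left\{ \begin{pmatrix} ia_1 & & \\ & \ddots & \\ & & ia_{n+1} \end{pmatrix} \;\middle|\; a_j \in \mathbb{R},\ a_1 + \cdots + a_{n+1} = 0 \right\}. \tag{7.19}
\]

We also consider $\mathfrak{h} := \mathfrak{t}_{\mathbb{C}}$, which is described as follows:

\[
\mathfrak{h} = \left\{ \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_{n+1} \end{pmatrix} \;\middle|\; \lambda_j \in \mathbb{C},\ \lambda_1 + \cdots + \lambda_{n+1} = 0 \right\}. \tag{7.20}
\]

If a matrix $X$ commutes with each element of $\mathfrak{t}$, it will also commute with each
element of $\mathfrak{h}$. But then, as in the proof of Example 7.3, $X$ would have to be diagonal,
and if $X \in \mathfrak{su}(n+1)$, then $X$ would have to be in $\mathfrak{t}$. Thus, $\mathfrak{t}$ is actually a maximal
commutative subalgebra of $\mathfrak{su}(n+1)$.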

Now, let $E_{jk}$ denote the matrix that has a one in the $j$th row and $k$th column and
that has zeros elsewhere. A simple calculation shows that if $H \in \mathfrak{h}$ is as in (7.20),
then $H E_{jk} = \lambda_j E_{jk}$ and $E_{jk} H = \lambda_k E_{jk}$. Thus,

\[
[H, E_{jk}] = (\lambda_j - \lambda_k) E_{jk}. \tag{7.21}
\]

If $j \neq k$, then $E_{jk}$ is in $\mathfrak{sl}(n+1;\mathbb{C})$ and (7.21) shows that $E_{jk}$ is a simultaneous
eigenvector for each $\mathrm{ad}_H$ with $H$ in $\mathfrak{h}$, with eigenvalue $\lambda_j - \lambda_k$. Note that every
element $X$ of $\mathfrak{sl}(n+1;\mathbb{C})$ can be written uniquely as an element of the Cartan
subalgebra (the diagonal entries of $X$) plus a linear combination of the $E_{jk}$'s with
$j \neq k$ (the off-diagonal entries of $X$).
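
The relation (7.21) is easy to test numerically. The following is a minimal sketch in plain Python, not part of the original text; the helper names (`unit`, `matmul`, `bracket`, `satisfies_7_21`) are illustrative assumptions, and indices are 0-based rather than 1-based.

```python
from itertools import product

def unit(n, j, k):
    """E_jk: the n x n matrix with a 1 in row j, column k (0-based) and zeros elsewhere."""
    return [[1 if (r, c) == (j, k) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def satisfies_7_21(lams):
    """Check [H, E_jk] == (lam_j - lam_k) E_jk for H = diag(lams) and all j, k."""
    n = len(lams)
    H = [[lams[r] if r == c else 0 for c in range(n)] for r in range(n)]
    for j, k in product(range(n), repeat=2):
        E = unit(n, j, k)
        expected = [[(lams[j] - lams[k]) * E[r][c] for c in range(n)] for r in range(n)]
        if bracket(H, E) != expected:
            return False
    return True

# Diagonal entries summing to zero, as required in (7.20).
print(satisfies_7_21([2, -3, 1]))
```

The check runs over all pairs (j, k), including j = k, where both sides of (7.21) vanish.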

If we think at first of the roots as elements of $\mathfrak{h}^*$, then [according to (7.21)]
the roots are the linear functionals $\alpha_{jk}$ that associate to each $H \in \mathfrak{h}$, as in (7.20), the
quantity $\lambda_j - \lambda_k$. We identify $\mathfrak{h}$ with the subspace of $\mathbb{C}^{n+1}$ consisting of vectors whose
components sum to zero. The inner product $\langle X, Y \rangle = \operatorname{trace}(X^* Y)$ on $\mathfrak{h}$ is just the
restriction to this subspace of the usual inner product on $\mathbb{C}^{n+1}$. If we use this inner
product to transfer the roots from $\mathfrak{h}^*$ to $\mathfrak{h}$, we obtain the vectors

\[
\alpha_{jk} = e_j - e_k \qquad (j \neq k).
\]


The roots of $\mathfrak{sl}(n+1;\mathbb{C})$ form a root system that is conventionally called $A_n$,
with the subscript $n$ indicating that the rank of $\mathfrak{sl}(n+1;\mathbb{C})$ (i.e., the dimension of $\mathfrak{h}$)
is $n$. We see that each root has length $\sqrt{2}$ and

\[
\langle \alpha_{jk}, \alpha_{j'k'} \rangle
\]

has the value $0$, $\pm 1$, or $\pm 2$, depending on whether $\{j, k\}$ and $\{j', k'\}$ have zero, one,
or two elements in common. Thus

\[
2\,\frac{\langle \alpha_{jk}, \alpha_{j'k'} \rangle}{\langle \alpha_{jk}, \alpha_{jk} \rangle} \in \{0, \pm 1, \pm 2\}.
\]

If $\alpha$ and $\beta$ are roots and $\alpha \neq \beta$ and $\alpha \neq -\beta$, then the angle between $\alpha$ and $\beta$ is
either $\pi/3$, $\pi/2$, or $2\pi/3$, depending on whether $\langle \alpha, \beta \rangle$ has the value $1$, $0$, or $-1$.
It is easy to see that for any $j$ and $k$, the reflection $s_{\alpha_{jk}}$ acts on $\mathbb{C}^{n+1}$ by
interchanging the $j$th and $k$th entries of each vector. It follows that the Weyl group
of the $A_n$ root system is the permutation group on $n + 1$ elements.


7.7.2 The Orthogonal Algebras so(2n; C)


The root system for $\mathfrak{so}(2n;\mathbb{C})$ is denoted $D_n$. We consider the compact real form
$\mathfrak{so}(2n)$ of $\mathfrak{so}(2n;\mathbb{C})$, and we consider in $\mathfrak{so}(2n)$ the commutative subalgebra $\mathfrak{t}$
consisting of $2 \times 2$ block-diagonal matrices in which the $j$th diagonal block is of
the form

\[
\begin{pmatrix} 0 & a_j \\ -a_j & 0 \end{pmatrix} \tag{7.22}
\]

for some $a_j \in \mathbb{R}$. We then consider the subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ of $\mathfrak{so}(2n;\mathbb{C})$, which
consists of $2 \times 2$ block-diagonal matrices in which the $j$th diagonal block is of the form
(7.22) with $a_j \in \mathbb{C}$. As we will show below, $\mathfrak{t}$ is actually a maximal commutative
subalgebra of $\mathfrak{so}(2n)$, so that $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{so}(2n;\mathbb{C})$.

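The root vectors are now $2 \times 2$ block matrices having a $2 \times 2$ matrix $C$ in the
$(j, k)$ block ($j < k$), the matrix $-C^{tr}$ in the $(k, j)$ block, and zero in all other
blocks, where $C$ is one of the four matrices

\[
C_1 = \begin{pmatrix} 1 & i \\ i & -1 \end{pmatrix}, \quad
C_2 = \begin{pmatrix} 1 & -i \\ -i & -1 \end{pmatrix}, \quad
C_3 = \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}, \quad
C_4 = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}.
\]

Direct calculation shows that these are, indeed, root vectors and that the corresponding
roots are the linear functionals on $\mathfrak{h}$ given by $i(a_j + a_k)$, $-i(a_j + a_k)$, $i(a_j - a_k)$,
and $-i(a_j - a_k)$, respectively.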
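
The "direct calculation" can be spot-checked by machine in the smallest case, so(4; C). The sketch below is illustrative only: it assumes the four blocks C1 through C4 are, up to scaling, the matrices with rows (1, i), (i, -1); (1, -i), (-i, -1); (1, -i), (i, 1); and (1, i), (-i, 1), and all function names are hypothetical. Integer values of a_1 and a_2 are used so that the complex arithmetic is exact.

```python
# Candidate 2 x 2 blocks C (an assumption of this sketch; any nonzero multiples work equally well).
BLOCKS = {
    "C1": [[1, 1j], [1j, -1]],
    "C2": [[1, -1j], [-1j, -1]],
    "C3": [[1, -1j], [1j, 1]],
    "C4": [[1, 1j], [-1j, 1]],
}

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def cartan_element(a1, a2):
    """Block-diagonal H in so(4; C) whose jth 2 x 2 block is [[0, a_j], [-a_j, 0]], as in (7.22)."""
    H = [[0] * 4 for _ in range(4)]
    for block, a in enumerate((a1, a2)):
        H[2 * block][2 * block + 1] = a
        H[2 * block + 1][2 * block] = -a
    return H

def root_vector(C):
    """4 x 4 matrix with C in the (1, 2) block and -C^tr in the (2, 1) block."""
    X = [[0] * 4 for _ in range(4)]
    for r in range(2):
        for c in range(2):
            X[r][2 + c] = C[r][c]
            X[2 + c][r] = -C[r][c]
    return X

def eigenvalue(H, X):
    """Return lam with [H, X] == lam * X, or None if X is not an eigenvector of ad_H."""
    B = bracket(H, X)
    lam = next(B[r][c] / X[r][c] for r in range(4) for c in range(4) if X[r][c] != 0)
    ok = all(B[r][c] == lam * X[r][c] for r in range(4) for c in range(4))
    return lam if ok else None

H = cartan_element(3, 5)  # a_1 = 3, a_2 = 5
for name, C in BLOCKS.items():
    print(name, eigenvalue(H, root_vector(C)))
```

With a_1 = 3 and a_2 = 5, the four eigenvalues should be i(a_1 + a_2), -i(a_1 + a_2), i(a_1 - a_2), and -i(a_1 - a_2), matching the roots listed above.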

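Let us use on $\mathfrak{h}$ the inner product $\langle X, Y \rangle = \operatorname{trace}(X^* Y)/2$. If we then identify $\mathfrak{h}$
with $\mathbb{C}^n$ by means of the map

\[
H \mapsto i(a_1, \ldots, a_n),
\]

our inner product on $\mathfrak{h}$ will correspond to the standard inner product on $\mathbb{C}^n$. The
roots as elements of $\mathbb{C}^n$ are then the vectors

\[
\pm e_j \pm e_k, \qquad j < k, \tag{7.23}
\]

where $\{e_j\}$ is the standard basis for $\mathbb{R}^n$.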

We now demonstrate that $\mathfrak{t}$ is a maximal commutative subalgebra of $\mathfrak{k}$, and also
that the center of $\mathfrak{so}(2n;\mathbb{C})$ is trivial for $n \geq 2$, as claimed in Example 7.3. It
is easy to check that every $X \in \mathfrak{g}$ can be expressed as an element of $\mathfrak{h}$ plus a
linear combination of the root vectors described above. If $X$ commutes with every
element of $\mathfrak{t}$, then $X$ also commutes with every element of $\mathfrak{h}$. Since each of the linear
functionals $i(\pm a_j \pm a_k)$, $j < k$, is nonzero on $\mathfrak{h}$, the coefficients of the root vectors
in the expansion of $X$ must be zero; that is, $X$ must be in $\mathfrak{h}$. If, also, $X$ is in $\mathfrak{k}$, then
$X$ must be in $\mathfrak{h} \cap \mathfrak{k} = \mathfrak{t}$. This shows that $\mathfrak{t}$ is maximal commutative in $\mathfrak{k}$.

Meanwhile, if $X$ is in the center of $\mathfrak{so}(2n;\mathbb{C})$, then as shown in the previous
paragraph, $X$ must belong to $\mathfrak{h}$. But then for each root vector $X_\alpha$ with root $\alpha$, we
must have

\[
0 = [X, X_\alpha] = \langle \alpha, X \rangle X_\alpha,
\]

so that $\langle \alpha, X \rangle = 0$. It is then easy to check that if $n \geq 2$, the roots in (7.23) span
$\mathfrak{h} \cong \mathbb{C}^n$ and we conclude that $X$ must be zero. (If, on the other hand, $n = 1$, then
there are no roots and $\mathfrak{so}(2;\mathbb{C}) = \mathfrak{h}$ is commutative.)


7.7.3 The Orthogonal Algebras so(2n + 1; C) 


The root system for $\mathfrak{so}(2n+1;\mathbb{C})$ is denoted $B_n$. We consider the compact real
form $\mathfrak{so}(2n+1)$ of $\mathfrak{so}(2n+1;\mathbb{C})$, and we consider in $\mathfrak{so}(2n+1)$ the commutative
subalgebra $\mathfrak{t}$ consisting of block diagonal matrices with $n$ blocks of size $2 \times 2$
followed by one block of size $1 \times 1$. We take the $2 \times 2$ blocks to be of the same
form as in $\mathfrak{so}(2n)$ and we take the $1 \times 1$ block to be zero. Then $\mathfrak{h} := \mathfrak{t}_{\mathbb{C}}$ consists of
those matrices in $\mathfrak{so}(2n+1;\mathbb{C})$ of the same form as in $\mathfrak{t}$ except that the off-diagonal
elements of the $2 \times 2$ blocks are permitted to be complex. The same argument as
in the case of $\mathfrak{so}(2n;\mathbb{C})$, based on the calculations below, shows that $\mathfrak{t}$ is maximal
commutative, so that $\mathfrak{h}$ is a Cartan subalgebra.

The Cartan subalgebra in $\mathfrak{so}(2n+1;\mathbb{C})$ is identifiable in an obvious way with
the Cartan subalgebra in $\mathfrak{so}(2n;\mathbb{C})$. In particular, both $\mathfrak{so}(2n;\mathbb{C})$ and $\mathfrak{so}(2n+1;\mathbb{C})$
have rank $n$. With this identification of the Cartan subalgebras, every root for
$\mathfrak{so}(2n;\mathbb{C})$ is also a root for $\mathfrak{so}(2n+1;\mathbb{C})$. There are $2n$ additional roots for
$\mathfrak{so}(2n+1;\mathbb{C})$. These have the matrices having the following $2 \times 1$ block in entries
$(2k, 2n+1)$ and $(2k+1, 2n+1)$ as their root vectors:

\[
B_1 = \begin{pmatrix} 1 \\ i \end{pmatrix}
\]

and having $-B_1^{tr}$ in entries $(2n+1, 2k)$ and $(2n+1, 2k+1)$, together with the
matrices having

\[
B_2 = \begin{pmatrix} 1 \\ -i \end{pmatrix}
\]

in entries $(2k, 2n+1)$ and $(2k+1, 2n+1)$ and $-B_2^{tr}$ in entries $(2n+1, 2k)$ and
$(2n+1, 2k+1)$. The corresponding roots, viewed as elements of $\mathfrak{h}^*$, are given by
$ia_k$ and $-ia_k$.

If we use the inner product to identify the roots with elements of $\mathfrak{h}$ and then
identify $\mathfrak{h}$ with $\mathbb{C}^n$ in the same way as in the previous subsection, the roots for
$\mathfrak{so}(2n+1;\mathbb{C})$ consist of the roots $\pm e_j \pm e_k$, $j < k$, for $\mathfrak{so}(2n;\mathbb{C})$, together with
additional roots of the form

\[
\pm e_j, \qquad j = 1, \ldots, n.
\]

These additional roots are shorter by a factor of $\sqrt{2}$ than the roots $\pm e_j \pm e_k$ for
$\mathfrak{so}(2n;\mathbb{C})$.


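7.7.4 The Symplectic Algebras sp(n; C)


The root system for $\mathfrak{sp}(n;\mathbb{C})$ is denoted $C_n$. We consider $\mathfrak{sp}(n;\mathbb{C})$, the space of
$2n \times 2n$ complex matrices $X$ satisfying $\Omega X^{tr} \Omega = X$, where $\Omega$ is the $2n \times 2n$
matrix

\[
\Omega = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.
\]

(Compare Proposition 3.25.) Explicitly, the elements of $\mathfrak{sp}(n;\mathbb{C})$ are matrices of the
form

\[
X = \begin{pmatrix} A & B \\ C & -A^{tr} \end{pmatrix},
\]

where $A$ is an arbitrary $n \times n$ matrix and $B$ and $C$ are arbitrary symmetric
matrices. We consider the compact real form $\mathfrak{sp}(n) = \mathfrak{sp}(n;\mathbb{C}) \cap \mathfrak{u}(2n)$. (Compare
Sect. 1.2.8.)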

We consider the commutative subalgebra $\mathfrak{t}$ of $\mathfrak{sp}(n)$ consisting of matrices of the
form

\[
\operatorname{diag}(a_1, \ldots, a_n, -a_1, \ldots, -a_n),
\]

where each $a_j$ is pure imaginary. We then consider the subalgebra $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ of
$\mathfrak{sp}(n;\mathbb{C})$, which consists of matrices of the same form but where each $a_j$ is now an
arbitrary complex number. As in previous subsections, the calculations below will
show that $\mathfrak{t}$ is maximal commutative, so that $\mathfrak{h}$ is a Cartan subalgebra.

The $2n \times 2n$ matrices of the block form

\[
\begin{pmatrix} 0 & E_{jk} + E_{kj} \\ 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 0 \\ E_{jk} + E_{kj} & 0 \end{pmatrix} \tag{7.24}
\]

($j \neq k$) are root vectors for which the corresponding roots are $(a_j + a_k)$ and
$-(a_j + a_k)$, respectively. Next, matrices of the block form

\[
\begin{pmatrix} E_{jk} & 0 \\ 0 & -E_{kj} \end{pmatrix} \tag{7.25}
\]

($j \neq k$) are root vectors for which the corresponding roots are $(a_j - a_k)$. Finally,
matrices of the block form

\[
\begin{pmatrix} 0 & E_{jj} \\ 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 0 \\ E_{jj} & 0 \end{pmatrix} \tag{7.26}
\]

are root vectors for which the corresponding roots are $2a_j$ and $-2a_j$.
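
We use on $\mathfrak{h}$ the inner product $\langle X, Y \rangle = \operatorname{trace}(X^* Y)/2$. If we then identify $\mathfrak{h}$
with $\mathbb{C}^n$ by means of the map

\[
H \mapsto (a_1, \ldots, a_n),
\]

our inner product on $\mathfrak{h}$ will correspond to the standard inner product on $\mathbb{C}^n$. The
roots are then the vectors of the form

\[
\pm e_j \pm e_k, \qquad j < k,
\]

and of the form

\[
\pm 2e_j, \qquad j = 1, \ldots, n.
\]

This root system is the same as that for $\mathfrak{so}(2n+1;\mathbb{C})$, except that instead of $\pm e_j$
we now have $\pm 2e_j$.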
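
As a sanity check on the three root lists in this section, the sketch below (illustrative only; the function names are hypothetical) enumerates the roots of D_n, B_n, and C_n as integer vectors and counts them. Because each root space is one-dimensional, the rank plus the number of roots should equal the dimension of the algebra: n(2n - 1) for so(2n; C) and n(2n + 1) for both so(2n + 1; C) and sp(n; C).

```python
from itertools import combinations, product

def roots_D(n):
    """Vectors +-e_j +- e_k with j < k, as in (7.23)."""
    roots = set()
    for j, k in combinations(range(n), 2):
        for sj, sk in product((1, -1), repeat=2):
            v = [0] * n
            v[j], v[k] = sj, sk
            roots.add(tuple(v))
    return roots

def axis_roots(n, scale):
    """Vectors +-scale * e_j for j = 1, ..., n."""
    roots = set()
    for j in range(n):
        for sign in (1, -1):
            v = [0] * n
            v[j] = sign * scale
            roots.add(tuple(v))
    return roots

def roots_B(n):
    return roots_D(n) | axis_roots(n, 1)

def roots_C(n):
    return roots_D(n) | axis_roots(n, 2)

def squared_lengths(roots):
    """Sorted list of the distinct squared lengths occurring among the roots."""
    return sorted({sum(x * x for x in v) for v in roots})

n = 3
for name, roots in (("D", roots_D(n)), ("B", roots_B(n)), ("C", roots_C(n))):
    print(name, len(roots), squared_lengths(roots))
```

The squared lengths make the comparison at the end of the section concrete: the extra roots are shorter than the roots of D_n in the B_n case and longer in the C_n case.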


7.8 Exercises 


1. Let $\mathfrak{h}_{\mathbb{C}}$ denote the complexification of the Lie algebra of the Heisenberg group,
namely the space of all complex $3 \times 3$ upper triangular matrices with zeros on
the diagonal.

(a) Show that the maximal commutative subalgebras of $\mathfrak{h}_{\mathbb{C}}$ are precisely the
two-dimensional subalgebras of $\mathfrak{h}_{\mathbb{C}}$ that contain the center of $\mathfrak{h}_{\mathbb{C}}$.

(b) Show that $\mathfrak{h}_{\mathbb{C}}$ does not have any Cartan subalgebras (in the sense of
Definition 7.10).

2. Give an example of a maximal commutative subalgebra of $\mathfrak{sl}(2;\mathbb{C})$ that is not a
Cartan subalgebra.

3. Show that the Hilbert-Schmidt inner product on $\mathfrak{sl}(n;\mathbb{C})$, given by
$\langle X, Y \rangle = \operatorname{trace}(X^* Y)$, is invariant under the adjoint action of $SU(n)$ and is real
on $\mathfrak{su}(n)$.

4. Using the root space decomposition in Sect. 7.7.2, show that the Lie algebra
$\mathfrak{so}(4;\mathbb{C})$ is isomorphic to the Lie algebra direct sum $\mathfrak{sl}(2;\mathbb{C}) \oplus \mathfrak{sl}(2;\mathbb{C})$. Then
show that $\mathfrak{so}(4)$ is isomorphic to $\mathfrak{su}(2) \oplus \mathfrak{su}(2)$.

5. Suppose $\mathfrak{g}$ is a Lie subalgebra of $M_n(\mathbb{C})$ and assume that for all $X \in \mathfrak{g}$, we have
$X^* \in \mathfrak{g}$, where $X^*$ is the usual matrix adjoint of $X$. Let $\mathfrak{k} = \mathfrak{g} \cap \mathfrak{u}(n)$.

(a) Show that $\mathfrak{k}$ is a real subalgebra of $\mathfrak{g}$ and that $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$.

(b) Define an inner product on $\mathfrak{g}$ by

\[
\langle X, Y \rangle = \operatorname{trace}(X^* Y).
\]

Show that this inner product satisfies all the properties in Proposition 7.4.

6. For any Lie algebra $\mathfrak{g}$, let the Killing form be the symmetric bilinear form $B$
on $\mathfrak{g}$ defined by

\[
B(X, Y) = \operatorname{trace}(\mathrm{ad}_X \mathrm{ad}_Y).
\]

(a) Show that for any $X \in \mathfrak{g}$, the operator $\mathrm{ad}_X : \mathfrak{g} \to \mathfrak{g}$ is skew-symmetric
with respect to $B$, meaning that

\[
B(\mathrm{ad}_X(Y), Z) = -B(Y, \mathrm{ad}_X(Z))
\]

for all $Y, Z \in \mathfrak{g}$.

(b) Suppose $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ is semisimple. Show that $B$ is nondegenerate, meaning
that for all nonzero $X \in \mathfrak{g}$, there is some $Y \in \mathfrak{g}$ for which $B(X, Y) \neq 0$.
Hint: Use Proposition 7.4.

7. Let $\mathfrak{g}$ be a Lie algebra and let $B$ be the Killing form on $\mathfrak{g}$ (Exercise 6). Show
that if $\mathfrak{g}$ is a nilpotent Lie algebra (Definition 3.15), then $B(X, Y) = 0$ for all
$X, Y \in \mathfrak{g}$.
8. Let $\mathfrak{g}$ denote the vector space of $3 \times 3$ complex matrices of the form

\[
\begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix},
\]

where $A$ is a $2 \times 2$ matrix with trace zero and $B$ is an arbitrary $2 \times 1$ matrix.

(a) Show that $\mathfrak{g}$ is a subalgebra of $M_3(\mathbb{C})$.

(b) Let $X$, $Y$, $H$, $e_1$, and $e_2$ be the following basis for $\mathfrak{g}$. We let $X$, $Y$, and
$H$ be the usual $\mathfrak{sl}(2;\mathbb{C})$ basis in the "A" slot, with $B = 0$. We let $e_1$ and
$e_2$ be the matrices with $A = 0$ and with $(1\ 0)^{tr}$ and $(0\ 1)^{tr}$ in the "B"
slot, respectively. Compute the commutation relations among these basis
elements.

(c) Show that $\mathfrak{g}$ has precisely one nontrivial ideal, namely the span of $e_1$ and
$e_2$. Conclude that $\mathfrak{g}$ is not semisimple.
Hint: First, determine the subspaces of $\mathfrak{g}$ that are invariant under the adjoint
action of the $\mathfrak{sl}(2;\mathbb{C})$ algebra spanned by $X$, $Y$, and $H$, and then determine
which of these subspaces are also invariant under the adjoint action of $e_1$
and $e_2$. In determining the $\mathfrak{sl}(2;\mathbb{C})$-invariant subspaces, use Exercise 10 of
Chapter 4.

(d) Let $\mathfrak{h}$ be the one-dimensional subspace of $\mathfrak{g}$ spanned by the element $H$.
Show that $\mathfrak{h}$ is a maximal commutative subalgebra of $\mathfrak{g}$ and that $\mathrm{ad}_H$ is
diagonalizable. Show that the eigenvalues of $\mathrm{ad}_H$ are $0$, $\pm 1$, and $\pm 2$.
Note: The "roots" for $\mathfrak{h}$ are thus the numbers $\pm 1$ and $\pm 2$, which would
not be possible if $\mathfrak{h}$ were a Cartan subalgebra of the form $\mathfrak{h} = \mathfrak{t}_{\mathbb{C}}$ in a
semisimple Lie algebra.

(e) Let $\mathfrak{g}_1$ and $\mathfrak{g}_{-1}$ denote the eigenspaces of $\mathrm{ad}_H$ with eigenvalues $1$ and $-1$,
respectively. Show that $[\mathfrak{g}_1, \mathfrak{g}_{-1}] = \{0\}$.
Note: This result should be contrasted with the semisimple case, where
$[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$ is always a one-dimensional subspace of $\mathfrak{h}$, so that $\mathfrak{g}_\alpha$, $\mathfrak{g}_{-\alpha}$, and
$[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$ span a three-dimensional subalgebra of $\mathfrak{g}$ isomorphic to $\mathfrak{sl}(2;\mathbb{C})$.

9. Using Theorem 7.35 and the calculations in Sect. 7.7.3, show that the Lie
algebra $\mathfrak{so}(5;\mathbb{C})$ is simple.

10. Let $E = \mathbb{R}^n$, $n \geq 2$, and consider the $D_n$ root system, consisting of the vectors
of the form $\pm e_j \pm e_k$, with $j < k$. Show that the Weyl group of $D_n$ is the group
of transformations of $\mathbb{R}^n$ expressible as a composition of a permutation of the
entries and an even number of sign changes.

11. Let $E = \mathbb{R}^n$ and consider the $B_n$ root system, consisting of the vectors of
the form $\pm e_j \pm e_k$, with $j < k$, together with the vectors of the form $\pm e_j$,
$j = 1, \ldots, n$. Show that the Weyl group of $B_n$ is the group of transformations
expressible as a composition of a permutation of the entries and an arbitrary
number of sign changes.
Note: Since each root in $C_n$ is a multiple of a root in $B_n$, and vice versa, the
reflections for $C_n$ are the same as the reflections for $B_n$. Thus, the Weyl group
of $C_n$ is the same as that of $B_n$.

12. Let $\mathfrak{g}$ be a complex simple Lie algebra, with complex structure denoted by
$J$. Let $\mathfrak{g}_{\mathbb{R}}$ denote the Lie algebra $\mathfrak{g}$ viewed as a real Lie algebra of twice the
dimension. Now let $\mathfrak{g}'$ be the complexification of $\mathfrak{g}_{\mathbb{R}}$, with the complex structure
on $\mathfrak{g}'$ denoted by $i$.

(a) Show that $\mathfrak{g}'$ decomposes as a Lie algebra direct sum $\mathfrak{g}' = \mathfrak{g}_1 \oplus \mathfrak{g}_2$, where
both $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are isomorphic to $\mathfrak{g}$.
Hint: Consider elements of $\mathfrak{g}'$ of the form $X + iJX$ and of the form $X - iJX$,
with $X \in \mathfrak{g}$.

(b) Show that $\mathfrak{g}_{\mathbb{R}}$ is simple as a real Lie algebra. (That is to say, there is no
nontrivial real subspace $\mathfrak{h}$ of $\mathfrak{g}$ such that $[X, Y] \in \mathfrak{h}$ for all $X \in \mathfrak{g}$ and
$Y \in \mathfrak{h}$.)

13. Let $\mathfrak{h}$ be a real simple Lie algebra, and assume that $\mathfrak{h}_{\mathbb{C}}$ is not simple. Show that
there is a complex simple Lie algebra $\mathfrak{g}$ such that $\mathfrak{h} \cong \mathfrak{g}_{\mathbb{R}}$, where the notation is
as in Exercise 12.
Hint: Imitate the proof of Theorem 7.32.

14. Show that the real Lie algebra $\mathfrak{so}(3;1)$ is isomorphic to $\mathfrak{sl}(2;\mathbb{C})_{\mathbb{R}}$, where the
notation is as in Exercise 12. Conclude that $\mathfrak{so}(3;1)$ is simple as a real Lie
algebra, but that $\mathfrak{so}(3;1)_{\mathbb{C}}$ is not simple and is isomorphic to $\mathfrak{sl}(2;\mathbb{C}) \oplus \mathfrak{sl}(2;\mathbb{C})$.
Hint: First show that $\mathfrak{so}(3;1)_{\mathbb{C}} \cong \mathfrak{sl}(2;\mathbb{C}) \oplus \mathfrak{sl}(2;\mathbb{C})$ and then show that the
two copies of $\mathfrak{sl}(2;\mathbb{C})$ are conjugates of each other.


Chapter 8 
Root Systems 


In this chapter, we consider root systems apart from their origins in semisimple 
Lie algebras. We establish numerous “factoids” about root systems, which will 
be used extensively in subsequent chapters. Here is one example of how results 
about root systems will be used. In Chapter 9, we construct each finite-dimensional, 
irreducible representation of a semisimple Lie algebra as a quotient of an infinite- 
dimensional representation known as a Verma module. To prove that the quotient 
representation is finite-dimensional, we prove that the weights of the quotient are 
invariant under the action of the Weyl group, that is, the group generated by the 
reflections about the hyperplanes orthogonal to the roots. It is not possible, however, 
to prove directly that the weights are invariant under all reflections, but only under 
reflections coming from a special subset of the root system known as the base. To 
complete the argument, then, we need to know that the Weyl group is actually 
generated by the reflections associated to the roots in the base. This claim is the 
content of Proposition 8.24. 


8.1 Abstract Root Systems 


A root system is any collection of vectors having the properties satisfied by the
roots (viewed as a subset of $i\mathfrak{t} \subset \mathfrak{h}$) of a semisimple Lie algebra, as encoded in the
following definition.


Definition 8.1. A root system $(E, R)$ is a finite-dimensional real vector space $E$
with an inner product $\langle \cdot, \cdot \rangle$, together with a finite collection $R$ of nonzero vectors in
$E$ satisfying the following properties:


1. The vectors in $R$ span $E$.
2. If $\alpha$ is in $R$ and $c \in \mathbb{R}$, then $c\alpha$ is in $R$ only if $c = \pm 1$.
3. If $\alpha$ and $\beta$ are in $R$, then so is $s_\alpha \cdot \beta$, where $s_\alpha$ is the linear transformation of $E$
defined by

\[
s_\alpha \cdot \beta = \beta - 2\,\frac{\langle \beta, \alpha \rangle}{\langle \alpha, \alpha \rangle}\,\alpha, \qquad \beta \in E.
\]

4. For all $\alpha$ and $\beta$ in $R$, the quantity

\[
2\,\frac{\langle \beta, \alpha \rangle}{\langle \alpha, \alpha \rangle}
\]

is an integer.


The dimension of $E$ is called the rank of the root system and the elements of $R$
are called roots.


Note that since $s_\alpha \cdot \alpha = -\alpha$, we have that $-\alpha \in R$ whenever $\alpha \in R$. In the theory
of symmetric spaces, there arise systems satisfying Conditions 1, 3, and 4, but not
Condition 2. These are called "nonreduced" root systems. In the theory of Coxeter
groups, there arise systems satisfying Conditions 1, 2, and 3, but not Condition 4.
These are called "noncrystallographic" or "nonintegral" root systems. In this book,
we consider only root systems satisfying all of the conditions in Definition 8.1.

The map $s_\alpha$ is the reflection about the hyperplane orthogonal to $\alpha$; that is,
$s_\alpha \cdot \alpha = -\alpha$ and $s_\alpha \cdot \beta = \beta$ for all $\beta$ that are orthogonal to $\alpha$, as is easily verified from the
formula for $s_\alpha$. From this description, it should be evident that $s_\alpha$ is an orthogonal
transformation of $E$ with determinant $-1$.

We can interpret Property 4 geometrically in one of two ways. In light of the
formula for $s_\alpha$, Property 4 is equivalent to saying that $s_\alpha \cdot \beta$ should differ from $\beta$
by an integer multiple of $\alpha$. Alternatively, if we recall that the orthogonal projection
of $\beta$ onto $\alpha$ is given by $(\langle \beta, \alpha \rangle / \langle \alpha, \alpha \rangle)\alpha$, we note that the quantity in Property 4 is
twice the coefficient of $\alpha$ in this projection. Thus, Property 4 is equivalent to saying
that the projection of $\beta$ onto $\alpha$ is an integer or half-integer multiple of $\alpha$.
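
For a finite set of vectors with rational coordinates, Conditions 2 through 4 of Definition 8.1 can be tested mechanically. The following is a minimal sketch using exact rational arithmetic, not a definitive implementation; the function names are hypothetical, and Condition 1 is not checked (one may simply take E to be the span of the given vectors).

```python
from fractions import Fraction
from itertools import product

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def reflect(alpha, beta):
    """s_alpha . beta = beta - 2 <beta, alpha> / <alpha, alpha> alpha."""
    c = 2 * dot(beta, alpha) / dot(alpha, alpha)
    return tuple(Fraction(b) - c * Fraction(a) for a, b in zip(alpha, beta))

def scalar_multiple(alpha, beta):
    """Return c with beta == c * alpha, or None if beta is not a multiple of alpha."""
    i = next(i for i, a in enumerate(alpha) if a != 0)
    c = Fraction(beta[i]) / Fraction(alpha[i])
    return c if all(Fraction(b) == c * Fraction(a) for a, b in zip(alpha, beta)) else None

def violated_conditions(vectors):
    """Return which of Conditions 2, 3, 4 fail for a finite set of nonzero vectors."""
    R = {tuple(Fraction(x) for x in v) for v in vectors}
    failed = set()
    for alpha, beta in product(R, repeat=2):
        c = scalar_multiple(alpha, beta)
        if c is not None and c not in (1, -1):
            failed.add(2)  # a multiple other than +-alpha is present
        if reflect(alpha, beta) not in R:
            failed.add(3)  # R is not invariant under s_alpha
        if (2 * dot(beta, alpha) / dot(alpha, alpha)).denominator != 1:
            failed.add(4)  # integrality fails
    return failed

B2 = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
print(violated_conditions(B2))
print(violated_conditions([(1, 0), (-1, 0), (2, 0), (-2, 0)]))
print(violated_conditions([(1, 0), (-1, 0), (1, 1), (-1, -1)]))
```

The second example is a rank-one nonreduced system (only Condition 2 fails), and the third is not closed under reflections (only Condition 3 fails).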

We have shown that one can associate a root system to every complex semisimple 
Lie algebra. It turns out that every root system arises in this way, although this is far 
from obvious—see Sect. 8.11. 


Definition 8.2. If $(E, R)$ is a root system, the Weyl group $W$ of $R$ is the subgroup
of the orthogonal group of $E$ generated by the reflections $s_\alpha$, $\alpha \in R$.


By assumption, each $s_\alpha$ maps $R$ into itself, indeed onto itself, since each $\beta \in R$
satisfies $\beta = s_\alpha \cdot (s_\alpha \cdot \beta)$. It follows that every element of $W$ maps $R$ onto itself.
Since the roots span $E$, a linear transformation of $E$ is determined by its action
on $R$. Thus, the Weyl group is a finite subgroup of $O(E)$ and may be regarded as a
subgroup of the permutation group on $R$. We denote the action of $w \in W$ on $H \in E$
by $w \cdot H$.

Proposition 8.3. Suppose $(E, R)$ and $(F, S)$ are root systems. Consider the vector
space $E \oplus F$, with the natural inner product determined by the inner products on
$E$ and $F$. Then $R \cup S$ is a root system in $E \oplus F$, called the direct sum of $R$ and $S$.


Here, we are identifying $E$ with the subspace of $E \oplus F$ consisting of all vectors
of the form $(e, 0)$ with $e$ in $E$, and similarly for $F$. Thus, more precisely, $R \cup S$
means the elements of the form $(\alpha, 0)$ with $\alpha$ in $R$ together with elements of the
form $(0, \beta)$ with $\beta$ in $S$. (Elements of the form $(\alpha, \beta)$ with $\alpha \in R$ and $\beta \in S$ are
not in $R \cup S$.)


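Proof. If $R$ spans $E$ and $S$ spans $F$, then $R \cup S$ spans $E \oplus F$, so Condition 1
is satisfied. Condition 2 holds because $R$ and $S$ are root systems in $E$ and $F$,
respectively. For Condition 3, if $\alpha$ and $\beta$ are both in $R$ or both in $S$, then
$s_\alpha \cdot \beta \in R \cup S$ because $R$ and $S$ are root systems. If $\alpha \in R$ and $\beta \in S$ or vice versa, then
$\langle \alpha, \beta \rangle = 0$, so that

\[
s_\alpha \cdot \beta = \beta \in R \cup S.
\]

Similarly, if $\alpha$ and $\beta$ are both in $R$ or both in $S$, then $2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle$ is an integer
because $R$ and $S$ are root systems, and if $\alpha \in R$ and $\beta \in S$ or vice versa, then
$2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle = 0$. Thus, Condition 4 holds for $R \cup S$. □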


Definition 8.4. A root system $(E, R)$ is called reducible if there exists an orthogonal
decomposition $E = E_1 \oplus E_2$ with $\dim E_1 > 0$ and $\dim E_2 > 0$ such that every
element of $R$ is either in $E_1$ or in $E_2$. If no such decomposition exists, $(E, R)$ is
called irreducible.


If $(E, R)$ is reducible, then it is not hard to see that the part of $R$ in $E_1$ is a root
system in $E_1$ and the part of $R$ in $E_2$ is a root system in $E_2$. Thus, a root system is
reducible precisely if it can be realized as a direct sum of two other root systems.
In the Lie algebra setting, the root system associated to a complex semisimple Lie
algebra $\mathfrak{g}$ is irreducible precisely if $\mathfrak{g}$ is simple (Theorem 7.35).


Definition 8.5. Two root systems $(E, R)$ and $(F, S)$ are said to be isomorphic if
there exists an invertible linear transformation $A : E \to F$ such that $A$ maps $R$ onto
$S$ and such that for all $\alpha \in R$ and $\beta \in E$, we have

\[
A(s_\alpha \cdot \beta) = s_{A\alpha} \cdot (A\beta).
\]

A map $A$ with this property is called an isomorphism.


Note that the linear map $A$ is not required to preserve inner products, but only to
preserve the reflections about the roots. If, for example, $F = E$ and $S$ consists of
elements of the form $c\alpha$ with $\alpha \in R$, for a fixed nonzero constant $c$, then $(F, S)$ is
isomorphic to $(E, R)$, with the isomorphism being the map $A = cI$.

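We now establish a basic result limiting the possible angles and length ratios
occurring in a root system.


Proposition 8.6. Suppose $\alpha$ and $\beta$ are roots, $\alpha$ is not a multiple of $\beta$, and
$\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$. Then one of the following holds:

1. $\langle \alpha, \beta \rangle = 0$.
2. $\langle \alpha, \alpha \rangle = \langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is $\pi/3$ or $2\pi/3$.
3. $\langle \alpha, \alpha \rangle = 2\langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is $\pi/4$ or $3\pi/4$.
4. $\langle \alpha, \alpha \rangle = 3\langle \beta, \beta \rangle$ and the angle between $\alpha$ and $\beta$ is $\pi/6$ or $5\pi/6$.


Fig. 8.1 The basic acute angles and length ratios


Figure 8.1 shows the allowed angles and length ratios, for the case of an acute
angle. In each case, $2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle = 1$, whereas $2\langle \beta, \alpha \rangle / \langle \beta, \beta \rangle$ takes the values
1, 2, and 3 in the three successive cases. Section 8.2 shows that each of the angles
and length ratios permitted by Proposition 8.6 actually occurs in some root system.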


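Proof. Suppose that $\alpha$ and $\beta$ are roots and let $m_1 = 2\langle \alpha, \beta \rangle / \langle \alpha, \alpha \rangle$ and
$m_2 = 2\langle \beta, \alpha \rangle / \langle \beta, \beta \rangle$, so that $m_1$ and $m_2$ are integers. Assume $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$ and note
that

\[
m_1 m_2 = 4\,\frac{\langle \alpha, \beta \rangle^2}{\langle \alpha, \alpha \rangle \langle \beta, \beta \rangle} = 4\cos^2\theta, \tag{8.1}
\]

where $\theta$ is the angle between $\alpha$ and $\beta$, and that

\[
\frac{m_2}{m_1} = \frac{\langle \alpha, \alpha \rangle}{\langle \beta, \beta \rangle} \geq 1 \tag{8.2}
\]

whenever $\langle \alpha, \beta \rangle \neq 0$. From (8.1), we conclude that $0 \leq m_1 m_2 \leq 4$. If $m_1 m_2 = 0$,
then $\cos\theta = 0$, so $\alpha$ and $\beta$ are orthogonal. If $m_1 m_2 = 4$, then $\cos^2\theta = 1$, which
means that $\alpha$ and $\beta$ are multiples of one another.


Fig. 8.2 The projection of $\beta$ onto $\alpha$ equals $\alpha/2$ and $s_\alpha \cdot \beta$ equals $\beta - \alpha$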


The remaining possible values for $m_1 m_2$ are 1, 2, and 3. If $m_1 m_2 = 1$, then
$\cos^2\theta = 1/4$, so $\theta$ is $\pi/3$ or $2\pi/3$. Since $m_1$ and $m_2$ are both integers, we must
have $m_1 = 1$ and $m_2 = 1$ or $m_1 = -1$ and $m_2 = -1$. In the first case, $\langle \alpha, \beta \rangle > 0$
and we have $\theta = \pi/3$ and in the second case, $\langle \alpha, \beta \rangle < 0$ and we have $\theta = 2\pi/3$.
In either case, (8.2) tells us that $\alpha$ and $\beta$ have the same length, establishing Case 2
of the proposition.

If $m_1 m_2 = 2$, then $\cos^2\theta = 1/2$, so that $\theta$ is $\pi/4$ or $3\pi/4$. Since $m_1$ and $m_2$ are
integers and $|m_2| \geq |m_1|$ by (8.2), we must have $m_1 = 1$ and $m_2 = 2$ or $m_1 = -1$
and $m_2 = -2$. In the first case, we have $\theta = \pi/4$ and in the second case, $\theta = 3\pi/4$.
In either case, (8.2) tells us that $\alpha$ is longer than $\beta$ by a factor of $\sqrt{2}$. The analysis
of the case $m_1 m_2 = 3$ is similar. □
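
The case analysis in this proof amounts to a finite enumeration of integer pairs. The sketch below, which is illustrative only (the function name `angle_cases` is hypothetical), lists every pair of nonzero integers with product 1, 2, or 3 and with the second at least as large in absolute value as the first, together with the resulting angle and ratio of squared lengths.

```python
import math

def angle_cases():
    """List (m1, m2, angle in degrees, <alpha,alpha>/<beta,beta>) for m1*m2 in {1, 2, 3}."""
    cases = []
    for m1 in range(-3, 4):
        for m2 in range(-3, 4):
            p = m1 * m2
            # Keep only products 1, 2, 3 and impose |m2| >= |m1|, as in (8.2).
            if p not in (1, 2, 3) or abs(m2) < abs(m1):
                continue
            # By (8.1), cos^2(theta) = p / 4; the sign of cos(theta) is the sign of m1.
            cos_theta = math.copysign(math.sqrt(p) / 2, m1)
            angle = round(math.degrees(math.acos(cos_theta)))
            # Here m1 is always +-1, so the integer division below is exact.
            cases.append((m1, m2, angle, m2 // m1))
    return sorted(cases, key=lambda case: case[2])

for case in angle_cases():
    print(case)
```

The six resulting rows correspond to the acute and obtuse angles in Cases 2, 3, and 4 of Proposition 8.6.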


Corollary 8.7. Suppose $\alpha$ and $\beta$ are roots. If the angle between $\alpha$ and $\beta$ is strictly
obtuse (i.e., strictly between $\pi/2$ and $\pi$), then $\alpha + \beta$ is a root. If the angle between
$\alpha$ and $\beta$ is strictly acute (i.e., strictly between $0$ and $\pi/2$), then $\alpha - \beta$ and $\beta - \alpha$ are
also roots.


See Figure 8.2.


Proof. The proof is by examining each of the three obtuse angles and each of the
three acute angles allowed by Proposition 8.6. Consider first the acute case and
adjust the labeling so that $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$. An examination of Cases 2, 3, and 4 in
the proposition (see Figure 8.1) shows that in each of these cases, the projection of
$\beta$ onto $\alpha$ is equal to $\alpha/2$. Thus, $s_\alpha \cdot \beta = \beta - \alpha$ is again a root, and so, therefore,
is $-(\beta - \alpha) = \alpha - \beta$. In the obtuse case (with $\langle \alpha, \alpha \rangle \geq \langle \beta, \beta \rangle$), the projection of $\beta$
onto $\alpha$ equals $-\alpha/2$, and, thus, $s_\alpha \cdot \beta = \alpha + \beta$ is again a root. □
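
Corollary 8.7 can be checked exhaustively on a concrete example. The sketch below is illustrative only: it assumes the standard integer realization of the G2 root system in the plane x + y + z = 0 of R^3 (six short roots obtained by permuting (1, -1, 0) and six long roots obtained by permuting (2, -1, -1) and its negative), and the function names are hypothetical.

```python
from itertools import permutations

def g2_roots():
    """G2 realized in the plane x + y + z = 0 of R^3 with integer coordinates."""
    short = set(permutations((1, -1, 0)))
    long_half = set(permutations((2, -1, -1)))
    long_all = long_half | {tuple(-x for x in v) for v in long_half}
    return short | long_all

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corollary_8_7_holds(R):
    """Check: obtuse pairs have alpha + beta in R; acute pairs have alpha - beta and beta - alpha in R."""
    for a in R:
        for b in R:
            if b == a or b == tuple(-x for x in a):
                continue  # angle 0 or pi is excluded by the word "strictly"
            total = tuple(x + y for x, y in zip(a, b))
            diff = tuple(x - y for x, y in zip(a, b))
            neg_diff = tuple(-x for x in diff)
            if dot(a, b) < 0 and total not in R:
                return False
            if dot(a, b) > 0 and (diff not in R or neg_diff not in R):
                return False
    return True

R = g2_roots()
print(len(R), corollary_8_7_holds(R))
```

A set that is not closed under reflections, such as the four vectors (±1, 0) and ±(1, 1), fails the same test.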


8.2 Examples in Rank Two 


If the rank of the root system is one, then there is only one possibility: $R$ must
consist of a pair $\{\alpha, -\alpha\}$, where $\alpha$ is a nonzero element of $E$. In rank two, there are
four possibilities, pictured in Figure 8.3 with their conventional names. In the case
of $A_1 \times A_1$, the lengths of the horizontal roots are unrelated to the lengths of the
vertical roots. In $A_2$, all roots have the same length; in $B_2$, the length of the longer
roots is $\sqrt{2}$ times the length of the shorter roots; in $G_2$, the length of the longer roots
is $\sqrt{3}$ times the length of the shorter roots. The angle between successive roots is
$\pi/2$ for $A_1 \times A_1$, $\pi/3$ for $A_2$, $\pi/4$ for $B_2$, and $\pi/6$ for $G_2$. It is easy to check
that each of these systems is actually a root system; in particular, Proposition 8.6 is
satisfied for each pair of roots.


Fig. 8.3 The rank-two root systems (panels: $A_1 \times A_1$, $A_2$, $B_2$, $G_2$)


Proposition 8.8. Every rank-two root system is isomorphic to one of the systems in 
Figure 8.3. 


Proof. It is harmless to assume $E = \mathbb{R}^2$; thus, let $R \subset \mathbb{R}^2$ be a root system. Let $\theta$
be the smallest angle occurring between any two vectors in $R$. Since the elements of
$R$ span $\mathbb{R}^2$, we can find two linearly independent vectors $\alpha$ and $\beta$ in $R$. If the angle
between $\alpha$ and $\beta$ is greater than $\pi/2$, then the angle between $\alpha$ and $-\beta$ is less than
$\pi/2$; thus, the minimum angle $\theta$ is at most $\pi/2$. Then, according to Proposition 8.6,
$\theta$ must be one of the following: $\pi/2$, $\pi/3$, $\pi/4$, $\pi/6$.

Let $\alpha$ and $\beta$ be two elements of $R$ such that the angle between them is the
minimum angle $\theta$. Then the vector $-s_\beta \cdot \alpha$ will be a vector that is at angle $\theta$ to
$\beta$ but on the opposite side of $\beta$ from $\alpha$, as shown in Figure 8.4. Thus, $-s_\beta \cdot \alpha$ is at
angle $2\theta$ to $\alpha$. Similarly, $-s_{-s_\beta \cdot \alpha} \cdot \beta$ is at angle $3\theta$ to $\alpha$. Continuing in the same way,
we can obtain vectors at angle $n\theta$ to $\alpha$ for all $n$. These vectors are unique since a
nontrivial positive multiple of a root is not allowed to be a root. Now, since each
of the allowed values of $\theta$ evenly divides $2\pi$, we will eventually come around to $\alpha$
again. Furthermore, there cannot be any other vectors besides those at angles $n\theta$ to
$\alpha$, or else there would be an angle smaller than $\theta$.


Fig. 8.4 The root $-s_\beta \cdot \alpha$ is at angle $2\theta$ from $\alpha$


Thus, $R$ must consist of $n$ equally spaced vectors, with consecutive vectors
separated by angle $\theta$, where $\theta$ is one of the angles allowed by Proposition 8.6. If,
say, $\theta = \pi/4$, then in order to satisfy the length requirement in Proposition 8.6, the
roots must alternate between a shorter length and a second length that is greater by
a factor of $\sqrt{2}$. Thus, our root system must be isomorphic to $B_2$. Similar reasoning
shows that all remaining values of $\theta$ yield one of the root systems in Figure 8.3. □


Using the results of Sect. 7.7, we can see that the root systems $A_1 \times A_1$, $A_2$, and
$B_2$ arise as root systems of classical Lie algebras as follows. The root system $A_1 \times A_1$
is the root system of $\mathfrak{so}(4;\mathbb{C})$, which is isomorphic to $\mathfrak{sl}(2;\mathbb{C}) \oplus \mathfrak{sl}(2;\mathbb{C})$; $A_2$ is the
root system of $\mathfrak{sl}(3;\mathbb{C})$; and $B_2$ is the root system of $\mathfrak{so}(5;\mathbb{C})$, which is isomorphic
to the root system of $\mathfrak{sp}(2;\mathbb{C})$. The root system $G_2$, meanwhile, is the root system
of an "exceptional" Lie algebra, which is also referred to as $G_2$ (Sect. 8.11).


Proposition 8.9. If $R$ is a rank-two root system with minimum angle $\theta = 2\pi/n$,
$n = 4, 6, 8, 12$, then the Weyl group of $R$ is the symmetry group of a regular
$n/2$-gon.


Proof. The group $W$ will contain at least $n/2$ reflections, one for each pair $\pm\alpha$ of
roots. If $\alpha$ and $\beta$ are roots with some angle $\phi$ between them, then the composition
of the reflections $s_\alpha$ and $s_\beta$ will be a rotation by angle $\pm 2\phi$, with the direction of
the rotation depending on the order of the composition. To see this, note that $s_\alpha$
and $s_\beta$ both have determinant $-1$ and so $s_\alpha s_\beta$ has determinant $1$ and is, therefore,
a rotation by some angle. To determine the angle, it suffices to apply $s_\beta s_\alpha$ to any
nonzero vector, for example, $\alpha$. However, $s_\alpha \cdot \alpha = -\alpha$ and $s_\beta \cdot (-\alpha)$ is at angle $2\phi$
to $\alpha$, as in Figure 8.4.

Now, since $\phi$ can be any integer multiple of $\theta$, we obtain all rotations by integer
multiples of $2\theta$. Meanwhile, the composition of a reflection $s_\alpha$ and a rotation by
angle $2\phi$ will be another reflection $s_\beta$, where $\beta$ is a root at angle $\phi$ to $\alpha$, as the
reader may verify. Thus, the set of $n/2$ reflections together with rotations by integer
multiples of $2\theta$ form a group; this is the Weyl group of the rank-two root system.
But this group is also the symmetry group of a regular $n/2$-gon, also known as the
dihedral group on $n/2$ elements. □
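
The count in Proposition 8.9 (n/2 reflections and n group elements in all) can be confirmed by brute force. The following sketch, under the assumption that each rank-two system is given by integer coordinates, represents each Weyl group element by the permutation it induces on the roots and closes the set of reflections under composition. The names `weyl_group` and `SYSTEMS` are hypothetical, and A2 and G2 are realized in the plane x + y + z = 0 of R^3.

```python
from itertools import permutations, product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, beta):
    """s_alpha . beta for integer roots; the coefficient is an integer by Property 4."""
    num, den = 2 * dot(beta, alpha), dot(alpha, alpha)
    assert num % den == 0
    c = num // den
    return tuple(b - c * a for a, b in zip(alpha, beta))

def weyl_group(roots):
    """Return (reflections, group), each element encoded as a permutation of the sorted roots."""
    roots = sorted(roots)
    index = {r: i for i, r in enumerate(roots)}
    reflections = {tuple(index[reflect(a, b)] for b in roots) for a in roots}
    identity = tuple(range(len(roots)))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in reflections:
            h = tuple(s[i] for i in g)  # apply g first, then the reflection s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return reflections, group

SYSTEMS = {
    "A1xA1": {(1, 0), (-1, 0), (0, 1), (0, -1)},
    "A2": set(permutations((1, -1, 0))),
    "B2": {v for v in product((-1, 0, 1), repeat=2) if v != (0, 0)},
    "G2": set(permutations((1, -1, 0)))
    | set(permutations((2, -1, -1)))
    | set(permutations((-2, 1, 1))),
}

for name, roots in SYSTEMS.items():
    reflections, group = weyl_group(roots)
    print(name, len(roots), len(reflections), len(group))
```

Encoding group elements as permutations of R is legitimate because, as noted after Definition 8.2, an element of W is determined by its action on the roots.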


Note that in the case of $A_2$, the Weyl group consists of three reflections together
with three rotations (by multiples of $2\pi/3$). In this case, the Weyl group is not the
full symmetry group of the root system: Rotations by $\pi/3$ map $R$ onto itself but are
not elements of the Weyl group.


8.3 Duality 


In this section, we introduce an important duality operation on root systems. 


Definition 8.10. If $(E, R)$ is a root system, then for each root $\alpha \in R$, the coroot
$H_\alpha$ is the vector given by

\[
H_\alpha = 2\,\frac{\alpha}{\langle \alpha, \alpha \rangle}.
\]

The set of all coroots is denoted $R^\vee$ and is called the dual root system to $R$.


This definition is consistent with the use of the term "coroot" in Chapter 7; see
Definition 7.28. Point 4 in the definition of a root system may be restated as saying
that $\langle \beta, H_\alpha \rangle$ should be an integer for all roots $\alpha$ and $\beta$.


Proposition 8.11. If $R$ is a root system, then $R^\vee$ is also a root system and the Weyl
group for $R^\vee$ is the same as the Weyl group for $R$. Furthermore, $(R^\vee)^\vee = R$.

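Proof. We compute that

\[
\langle H_\alpha, H_\alpha \rangle = \frac{4}{\langle \alpha, \alpha \rangle}
\]

and, therefore,

\[
2\,\frac{H_\alpha}{\langle H_\alpha, H_\alpha \rangle} = 2 \left( \frac{2\alpha}{\langle \alpha, \alpha \rangle} \right) \frac{\langle \alpha, \alpha \rangle}{4} = \alpha. \tag{8.3}
\]

If we take the inner product of (8.3) with $H_\beta$, we see that

\[
2\,\frac{\langle H_\alpha, H_\beta \rangle}{\langle H_\alpha, H_\alpha \rangle} = \langle \alpha, H_\beta \rangle = 2\,\frac{\langle \alpha, \beta \rangle}{\langle \beta, \beta \rangle}, \tag{8.4}
\]

which means that the left-hand side of (8.4) is an integer.
Furthermore, since $H_\alpha$ is a multiple of $\alpha$, it is evident that $s_\alpha = s_{H_\alpha}$. Since, also,
$s_\alpha$ is an orthogonal transformation, we have

\[
s_\alpha \cdot H_\beta = 2\,\frac{s_\alpha \cdot \beta}{\langle \beta, \beta \rangle} = 2\,\frac{s_\alpha \cdot \beta}{\langle s_\alpha \cdot \beta, s_\alpha \cdot \beta \rangle} = H_{s_\alpha \cdot \beta}.
\]

Thus, the set of coroots is invariant under each reflection $s_\alpha$ ($= s_{H_\alpha}$). This
observation, together with (8.4), shows that $R^\vee$ is again a root system, with the
remaining properties of root systems for $R^\vee$ following immediately from the
corresponding properties of $R$.

Since $s_\alpha = s_{H_\alpha}$, we see that $R$ and $R^\vee$ have the same Weyl group. Finally, (8.3)
says that the formula for $\alpha$ in terms of $H_\alpha$ is the same as the formula for $H_\alpha$ in terms
of $\alpha$. Thus, $H_{H_\alpha} = \alpha$ and $(R^\vee)^\vee = R$. □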


Note from (8.4) that the integer associated to the pair (H_α, H_β) in R^∨ is the same
as the integer associated to the pair (β, α) (not (α, β)) in R. If all the roots in R
have the same length, R^∨ is isomorphic to R. Even if not all the roots have the same
length, R and R^∨ could be isomorphic. In the case of B_2, for example, the dual
root system R^∨ can be converted back to R by a π/4 rotation and a scaling. (See
Figure 8.5.) On the other hand, the rank-three root systems B_3 and C_3 (Sect. 8.9)
are dual to each other but not isomorphic to each other.

Figure 8.5 shows the dual root system for the root system B_2. On the left-hand
side of the figure, the long roots have been normalized to have length √2. Thus, for
each long root we have H_α = α and for each short root we have H_α = 2α, yielding
the root system on the right-hand side of the figure.
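The duality just described can be checked by direct computation. The following is a minimal sketch, assuming one particular choice of coordinates for B_2 (short roots ±e_1, ±e_2 and long roots ±e_1 ± e_2, so that the long roots have length √2); the names `inner`, `coroot`, and `R_dual` are illustrative and not taken from the text.

```python
from fractions import Fraction

def inner(u, v):
    # Standard inner product on R^n
    return sum(a * b for a, b in zip(u, v))

def coroot(alpha):
    # H_alpha = 2 alpha / <alpha, alpha>  (Definition 8.10)
    n = inner(alpha, alpha)
    return tuple(Fraction(2 * a, n) for a in alpha)

# B_2 in assumed coordinates: long roots have length sqrt(2)
short = [(1, 0), (-1, 0), (0, 1), (0, -1)]
long_ = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
R = short + long_

R_dual = {coroot(a) for a in R}
R_double_dual = {coroot(h) for h in R_dual}

# The integers 2 <H_a, H_b> / <H_a, H_a> appearing in (8.4)
dual_integers = [2 * inner(ha, hb) / inner(ha, ha) for ha in R_dual for hb in R_dual]

print(sorted(R_dual))
print(R_double_dual == set(R))
```

In this model each long root is its own coroot and each short root is doubled, so the dual consists of the vectors ±2e_1, ±2e_2, ±e_1 ± e_2, and taking coroots a second time returns the original set.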


Fig. 8.5 The B_2 root system and its dual


8.4 Bases and Weyl Chambers


We now introduce the notion of a base, or a system of positive simple roots, for a 
root system. 


Definition 8.12. If (E, R) is a root system, a subset Δ of R is called a base if the
following conditions hold:


1. Δ is a basis for E as a vector space.

2. Each root α ∈ R can be expressed as a linear combination of elements of Δ
with integer coefficients and in such a way that the coefficients are either all
non-negative or all nonpositive.


The roots for which the coefficients are non-negative are called positive roots
and the others are called negative roots (relative to the base Δ). The set of positive
roots relative to a fixed base Δ is denoted R⁺ and the set of negative roots is denoted
R⁻. The elements of Δ are called the positive simple roots.


Note that since Δ is a basis for E, each α can be expressed uniquely as a linear
combination of the elements of Δ. We require that Δ be such that the coefficients in
the expansion of each α ∈ R be integers and such that all the nonzero coefficients
have the same sign.


Proposition 8.13. If α and β are distinct elements of a base Δ for R, then
⟨α, β⟩ ≤ 0.


Geometrically, this means that either α and β are orthogonal or the angle between
them is obtuse.


Proof. Since α ≠ β, if we had ⟨α, β⟩ > 0, then the angle between α and β would be
strictly between 0 and π/2. Then, by Corollary 8.7, α − β would be an element of R.
Since the elements of Δ form a basis for E as a vector space, each element of R has
a unique expansion in terms of elements of Δ, and the coefficients of that expansion
are supposed to be either all non-negative or all nonpositive. The expansion of α − β,
however, has one positive and one negative coefficient. Thus, α − β cannot be a root,
which means that ⟨α, β⟩ ≤ 0. □


The reader is invited to find a base for each of the rank-two root systems in 
Figure 8.3. We now show that every root system has a base. 


Proposition 8.14. There exists a hyperplane V through the origin in E that does 
not contain any element of R. 


Proof. For each α ∈ R, let V_α denote the hyperplane

$$V_\alpha = \{ H \in E \mid \langle \alpha, H \rangle = 0 \}.$$

Since there are only finitely many of these hyperplanes, there exists H ∈ E not in
any V_α. (See Exercise 2.) Let V be the hyperplane through the origin orthogonal to
H. Since H is not in any V_α, we see that H is not orthogonal to any α, which means
that no α is in V. □


Definition 8.15. Let (E, R) be a root system. Let V be a hyperplane through the
origin in E such that V does not contain any root. Choose one “side” of V and let
R⁺ denote the set of roots on this side of V. An element α of R⁺ is decomposable
if there exist β and γ in R⁺ such that α = β + γ; if no such elements exist, α is
indecomposable.


The “sides” of V can be defined as the connected components of the set E \ V.
Alternatively, if H is a nonzero vector orthogonal to V, then V is the set of μ ∈ E
for which ⟨μ, H⟩ = 0. The two “sides” of V are then the sets {μ ∈ E | ⟨μ, H⟩ > 0}
and {μ ∈ E | ⟨μ, H⟩ < 0}.


Theorem 8.16. Suppose (E, R) is a root system, V is a hyperplane through the
origin in E not containing any element of R, and R⁺ is the set of roots lying on a
fixed side of V. Then the set of indecomposable elements of R⁺ is a base for R.


This construction of a base motivates the term “positive simple root” for the
elements of a base. We first define the positive roots (R⁺) as the roots on one side of
V and then define the positive simple roots (the base) as the set of indecomposable
(or “simple”) elements of R⁺.


Proof. Let Δ denote the set of indecomposable elements in R⁺. Choose a nonzero
vector H orthogonal to V so that the chosen side of V is the set of μ ∈ E for which
⟨H, μ⟩ > 0.


Step 1: Every α ∈ R⁺ can be expressed as a linear combination of elements of
Δ with non-negative integer coefficients. If not, then among all of the elements
of R⁺ that cannot be expressed in this way, choose α so that ⟨H, α⟩ is as small
as possible. Certainly α cannot be an element of Δ, so α must be decomposable,
α = β_1 + β_2, with β_1, β_2 ∈ R⁺. Now, β_1 and β_2 cannot both be expressible as
linear combinations of elements of Δ with non-negative integer coefficients, or
else α would be expressible in this way. However, ⟨H, α⟩ = ⟨H, β_1⟩ + ⟨H, β_2⟩,
and since the numbers ⟨H, β_1⟩ and ⟨H, β_2⟩ are both positive, they must be
smaller than ⟨H, α⟩, contradicting the minimality of α.

Step 2: If α and β are distinct elements of Δ, then ⟨α, β⟩ ≤ 0. Note that since
Δ is not yet known to be a base, Proposition 8.13 does not apply. Nevertheless,
if we had ⟨α, β⟩ > 0, then by Corollary 8.7, α − β and β − α would both be
roots, one of which would have to be in R⁺. If α − β were in R⁺, we would have
α = (α − β) + β and α would be decomposable. If, on the other hand, β − α were
in R⁺, we would have β = (β − α) + α, and β would be decomposable.
Since α and β are assumed indecomposable, we must have ⟨α, β⟩ ≤ 0.

Step 3: The elements of Δ are linearly independent. Suppose we have

$$\sum_{\alpha \in \Delta} c_\alpha \alpha = 0 \tag{8.5}$$

for some collection of constants c_α. Separate the sum into those terms where
c_α ≥ 0 and those where c_α = −d_α < 0, so that

$$\sum c_\alpha \alpha = \sum d_\beta \beta, \tag{8.6}$$

where the sums range over disjoint subsets of Δ. If u denotes the vector in (8.6),
we have

$$\langle u, u \rangle = \Big\langle \sum c_\alpha \alpha, \sum d_\beta \beta \Big\rangle = \sum \sum c_\alpha d_\beta \langle \alpha, \beta \rangle.$$

However, c_α and d_β are non-negative and (by Step 2) ⟨α, β⟩ ≤ 0. Thus, ⟨u, u⟩ ≤ 0
and u must be zero.

Now, if u = 0, then ⟨H, u⟩ = Σ c_α ⟨H, α⟩ = 0, which implies that all the c_α’s
are zero since c_α ≥ 0 and ⟨H, α⟩ > 0. The same argument then shows that all
the d_α’s are zero as well.

Step 4: Δ is a base. We have shown that Δ is linearly independent and that all
of the elements of R⁺ can be expressed as linear combinations of elements of Δ
with non-negative integer coefficients. The remaining elements of R, namely the
elements of R⁻, are simply the negatives of the elements of R⁺, and so they can
be expressed as linear combinations of elements of Δ with nonpositive integer
coefficients. Since the elements of R span E, then Δ must also span E and it is
a base. □


Figure 8.6 illustrates Theorem 8.16 in the case of the G_2 root system, with α_1
and α_2 being the indecomposable roots on one side of the dashed line.
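The construction in Theorem 8.16 is easy to carry out by machine. The sketch below assumes a standard integer model of G_2 inside the plane x + y + z = 0 in R^3 (this coordinate choice, the vector `H`, and all function names are assumptions made for illustration, not notation from the text). It picks the roots on one side of the hyperplane orthogonal to `H` and keeps those that are not a sum of two such roots.

```python
from itertools import product

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

# G_2 realized in the plane x + y + z = 0 (assumed integer model)
short = [(1, -1, 0), (1, 0, -1), (0, 1, -1)]
long_ = [(2, -1, -1), (-1, 2, -1), (-1, -1, 2)]
R = [r for v in short + long_ for r in (v, tuple(-x for x in v))]

# Hypothetical choice of H in the plane, not orthogonal to any root
H = (1, 5, -6)
assert all(inner(H, a) != 0 for a in R)

# Roots on the chosen side of the hyperplane V orthogonal to H
R_plus = [a for a in R if inner(H, a) > 0]

# A positive root is decomposable if it is a sum of two positive roots
sums = {add(b, c) for b, c in product(R_plus, repeat=2)}
base = [a for a in R_plus if a not in sums]

print(base)
```

With this choice of `H`, the two indecomposable roots are one short root and one long root with non-positive inner product, in agreement with Step 2 of the proof. A different `H` may produce a different base.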


Theorem 8.17. For any base Δ for R, there exists a hyperplane V and a side of V
such that Δ arises as in Theorem 8.16.


Proof. If Δ = {α_1, …, α_r} is a base for R, then Δ is a basis for E in the vector
space sense. Then, by elementary linear algebra, for any sequence of numbers
c_1, …, c_r there exists a unique γ ∈ E with ⟨γ, α_j⟩ = c_j, j = 1, …, r. In particular,
we can choose γ so that ⟨γ, α_j⟩ > 0 for all j. Then if R⁺ denotes the positive
roots with respect to Δ, we will have ⟨γ, α⟩ > 0 for all α ∈ R⁺, since α is a
linear combination of elements of Δ with non-negative coefficients. Thus, all of the
elements of R⁺ lie on the same side of the hyperplane V = {H ∈ E | ⟨γ, H⟩ = 0}.

Suppose now that α is an element of Δ and that α were expressible as a sum
of at least two elements of R⁺. Then at least one of these elements would be
distinct from α and, thus, not a multiple of α. Thus, α would be expressible as a
linear combination of the elements of Δ with non-negative coefficients, where the
coefficient of some β ≠ α would have to be nonzero. This would contradict the
independence of the elements of Δ. We conclude, then, that every α ∈ Δ is an
indecomposable element of R⁺. Thus, Δ is contained in the base associated to V as
in Theorem 8.16. However, the number of elements in a base must equal dim E, so,
actually, Δ is the base associated to V. □


Fig. 8.6 The roots α_1 and α_2 form a base for the root system G_2


Proposition 8.18. If Δ is a base for R, then the set of all coroots H_α, α ∈ Δ, is a
base for the dual root system R^∨.


Figure 8.7 illustrates Proposition 8.18 in the case of the root system B_2.


Lemma 8.19. Let Δ be a base, R⁺ the associated set of positive roots, and α an
element of Δ. Then α cannot be expressed as a linear combination of elements of
R⁺ \ {α} with non-negative real coefficients.


Proof. Let α_1 = α and let α_2, …, α_r be the remaining elements of Δ. Suppose
α_1 is a linear combination of elements β ≠ α_1 in R⁺ with non-negative coefficients.
Each such β can then be expanded in terms of α_1, …, α_r with non-negative (integer)
coefficients. Thus, we end up with

$$\alpha_1 = c_1 \alpha_1 + c_2 \alpha_2 + \cdots + c_r \alpha_r,$$

with each c_j ≥ 0. Since α_1, …, α_r are independent, we must have c_1 = 1 and
all other c_j’s equal 0. But since each β in the expansion of α_1 has non-negative
coefficients in the basis α_1, …, α_r, each β would have to be a multiple of α_1,
and thus actually equal to α_1, since the only multiple of α_1 in R⁺ is α_1. But this
contradicts the assumption that β was different from α_1. □


Fig. 8.7 Bases for B_2 and its dual


Proof of Proposition 8.18. Choose a hyperplane V such that the base Δ for R arises
as in Theorem 8.16, and call the side of V on which Δ lies the positive side. Let R⁺
denote the set of positive roots in R relative to the base Δ. Then the coroots H_α,
α ∈ R⁺, also lie on the positive side of V, and all the remaining coroots lie on the
negative side of V. Thus, applying Theorem 8.16 to R^∨, there exists a base Δ^∨ for
R^∨ such that the positive roots associated to Δ^∨ are precisely the H_α’s with α ∈ R⁺.

Now, if α ∈ R⁺ but α ∉ Δ, then α is a linear combination of α_1, …, α_r with
non-negative integer coefficients, at least two of which are nonzero. Thus, H_α is a
linear combination of H_{α_1}, …, H_{α_r} with non-negative real coefficients, at least two
of which are nonzero. Since, by Lemma 8.19, such an H_α cannot be in Δ^∨, the r
elements of Δ^∨ must be precisely H_{α_1}, …, H_{α_r}. □


Definition 8.20. The open Weyl chambers for a root system (E, R) are the
connected components of

$$E \setminus \bigcup_{\alpha \in R} V_\alpha,$$

where V_α is the hyperplane through the origin orthogonal to α. If Δ = {α_1, …, α_r}
is a base for R, then the open fundamental Weyl chamber in E (relative to Δ) is
the set of all H in E such that ⟨α_j, H⟩ > 0 for all j = 1, …, r.


Figure 8.8 shows the open fundamental Weyl chamber C associated to a
particular base for the A_2 root system. Note that the boundary of C is made up
of portions of the lines orthogonal to the roots (not the lines through the roots).

Since the elements of a base Δ form a basis for E as a vector space, elementary
linear algebra shows that the open fundamental Weyl chamber is convex, hence
connected, and nonempty. Since the only way one can exit a fundamental Weyl
chamber is by passing through a point H where ⟨α_j, H⟩ = 0, the open fundamental
Weyl chamber is, indeed, an open Weyl chamber. Note, also, that if ⟨α_j, H⟩ > 0 for
all j = 1, …, r, then ⟨α, H⟩ > 0 for all α ∈ R⁺, since α is a linear combination
of α_1, …, α_r with non-negative coefficients.


Fig. 8.8 The shaded region C is the open fundamental Weyl chamber associated to the base
{α_1, α_2} for A_2

Each w ∈ W is an orthogonal linear transformation that maps R to R and, thus,
maps the set of hyperplanes orthogonal to the roots to itself. It then easily follows
that for each open Weyl chamber C, the set w · C is another open Weyl chamber.

For any base Δ and the associated set R⁺ of positive roots, we have defined the
fundamental Weyl chamber to be the set of those elements having positive inner
product with each element of Δ, and therefore also with each element of R⁺. Our
next result says that we can reverse this process.


Proposition 8.21. For each open Weyl chamber C, there exists a unique base Δ_C
for R such that C is the open fundamental Weyl chamber associated to Δ_C. The
positive roots with respect to Δ_C are precisely those elements α of R such that α
has positive inner product with each element of C.


Thus, there is a one-to-one correspondence between bases and Weyl chambers. 


Proof. Let H be any element of C and let V be the hyperplane orthogonal to H.
Since H is contained in an open chamber, H is not orthogonal to any root, and,
thus, V does not contain any root. Thus, by Theorem 8.16, there exists a base
Δ = {α_1, …, α_r} lying on the same side of V as H. Since ⟨α_j, H⟩ has constant
sign on C, we see that every element of C has positive inner product with each
element of Δ and thus with every element of the associated set R⁺ of positive roots.
Thus, C must be the fundamental Weyl chamber associated to Δ. We have just
said that every α ∈ R⁺ has positive inner product with every element of C, which
means that each α ∈ R⁻ has negative inner product with each element of C. Thus,
R⁺ consists precisely of those roots having positive inner product with C.

Finally, if Δ′ is any base whose fundamental chamber is C, then each element
of Δ′ has positive inner product with H ∈ C, meaning that Δ′ lies entirely on the
same side of V as Δ. Thus, Δ′ has the same positive roots as Δ and therefore also
the same set of positive simple (i.e., indecomposable) roots as Δ. That is to say,
Δ′ = Δ. □


Proposition 8.22. Every root is an element of some base. 


Since C is the set of H ∈ E for which ⟨α_j, H⟩ > 0 for j = 1, …, r,
the codimension-one pieces of the boundary of C will consist of portions of the
hyperplanes V_{α_j} orthogonal to the elements of Δ. Thus, to prove the proposition,
we merely need to show that for every root α, the hyperplane V_α contains a
codimension-one piece of the boundary of some C.


Proof. Let α be a root and V_α the hyperplane orthogonal to α. If we apply Exercise 2
to V_α, we see that there exists some H ∈ V_α such that H does not belong to V_β for
any root β other than β = ±α. Then for small positive ε, the element H + εα will
be in an open Weyl chamber C.

We now claim that α must be a member of the base Δ_C = {α_1, …, α_r} in
Proposition 8.21. Since α has positive inner product with H + εα ∈ C, we
must at least have α ∈ R⁺, the set of positive roots associated to Δ_C. Write
α = Σ_{j=1}^{r} c_j α_j, with c_j ≥ 0, so that

$$0 = \langle \alpha, H \rangle = \sum_{j=1}^{r} c_j \langle \alpha_j, H \rangle.$$

Since H is clearly in the closure of C, we must have ⟨H, α_j⟩ ≥ 0, which means that
⟨α_j, H⟩ must be zero whenever c_j ≠ 0. If more than one of the c_j’s were nonzero,
H would be orthogonal to two distinct roots in Δ_C, contradicting our choice of H.
Thus, actually, α is in Δ_C. □


8.5 Weyl Chambers and the Weyl Group 


In this section, we establish several results about how the action of the Weyl group
relates to the Weyl chambers.


Proposition 8.23. The Weyl group acts transitively on the set of open Weyl
chambers.


Proposition 8.24. If Δ is a base, then W is generated by the reflections s_α with
α ∈ Δ.


Proof of Propositions 8.23 and 8.24. Fix a base Δ, let C be the fundamental chamber
associated to Δ, and let W′ be the subgroup of W generated by the s_α’s with
α ∈ Δ. Let D be any other chamber and let H and H′ be fixed elements of C and
D, respectively. We wish to show that there exists w ∈ W′ so that w · H′ belongs to
C. To this end, choose w ∈ W′ so that |w · H′ − H| is minimized, which is possible
because W′ ⊂ W is finite. If w · H′ were not in C, there would be some α ∈ Δ such
that ⟨α, w · H′⟩ < 0. In that case, direct calculation would show that

$$|w \cdot H' - H|^2 - |s_\alpha \cdot w \cdot H' - H|^2 = -4 \frac{\langle \alpha, w \cdot H' \rangle \langle \alpha, H \rangle}{\langle \alpha, \alpha \rangle} > 0,$$

which means that s_α · w · H′ is closer to H than w · H′ is (Figure 8.9). Since s_α ∈ W′,
this situation would contradict the minimality of w. Thus, actually, w · H′ ∈ C,
showing that D can be mapped to C by an element of W′. Since any chamber
can be mapped to C by an element of W′, any chamber can be mapped to any
other chamber by an element of W′. We conclude that W′, and thus also W, acts
transitively on the chambers, proving Proposition 8.23.


Fig. 8.9 If w · H′ and H are on opposite sides of the line orthogonal to α, then s_α · w · H′ is
closer to H than w · H′ is


To prove Proposition 8.24, we must show that W′ = W. Let s_α be the reflection
associated to an arbitrary root α. By Proposition 8.22, α belongs to Δ_D for some
chamber D. If w ∈ W′ is chosen so that w · D = C, then w · α will belong to Δ.
Now, it is easily seen that

$$s_{w \cdot \alpha} = w s_\alpha w^{-1},$$

so that s_α = w^{-1} s_{w·α} w. But since w · α ∈ Δ, both w and s_{w·α} belong to W′. Thus,
s_α belongs to W′ for every root α, which means that W (the group generated by the
s_α’s) equals W′. □
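Propositions 8.23 and 8.24 can be illustrated on a small example. The following sketch assumes coordinates for B_2 in which the roots are ±e_1, ±e_2, ±e_1 ± e_2 and takes {(1, −1), (0, 1)} as a base; these choices, and the helper names `reflect`, `generate`, and `act`, are assumptions for illustration. It generates the group from all reflections and from the simple reflections only, and then counts the chambers reached from one point of the fundamental chamber.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, v):
    # s_alpha . v = v - (2 <alpha, v> / <alpha, alpha>) alpha
    c = Fraction(2 * inner(alpha, v), inner(alpha, alpha))
    return tuple(x - c * a for x, a in zip(v, alpha))

def generate(reflecting_roots, dim):
    # Group generated by the given reflections. Each element w is stored as
    # the tuple (w.e_1, ..., w.e_dim) of images of the standard basis vectors.
    identity = tuple(tuple(int(i == k) for k in range(dim)) for i in range(dim))
    seen, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for alpha in reflecting_roots:
            sw = tuple(reflect(alpha, image) for image in w)  # s_alpha composed with w
            if sw not in seen:
                seen.add(sw)
                frontier.append(sw)
    return seen

def act(w, v):
    # w . v, extending w linearly from its values on the standard basis
    return tuple(sum(v[i] * w[i][k] for i in range(len(v))) for k in range(len(v)))

# B_2 in assumed coordinates, with an assumed base {alpha_1, alpha_2}
R = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
base = [(1, -1), (0, 1)]

W = generate(R, 2)           # generated by all s_alpha, alpha in R
W_prime = generate(base, 2)  # generated by the simple reflections only

# A chamber is identified by the signs of <alpha, .> over all roots alpha
H = (2, 1)  # a point of the open fundamental Weyl chamber for this base
chambers = {tuple(inner(a, act(w, H)) > 0 for a in R) for w in W_prime}

print(len(W), W == W_prime, len(chambers))
```

In this example the two generating sets give the same group of order 8, and the orbit of `H` under W′ meets 8 distinct chambers, which is all of them, since the four lines orthogonal to the roots cut the plane into 8 sectors.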


Proposition 8.25. Let C be a Weyl chamber and let H and H′ be elements of C̄,
the closure of C. If w · H = H′ for some w ∈ W, then H = H′.


That is to say, two distinct elements of C̄ cannot be in the same orbit of W.

By Proposition 8.24, each w ∈ W can be written as a product of reflections from
Δ. If k ≥ 0 is the smallest number of reflections from Δ needed to express w, then
any expression for w as the product of k reflections from Δ is called a minimal
expression for w. Such a minimal expression need not be unique.

The following technical lemma is the key to the proof of Proposition 8.25.


Lemma 8.26. Let Δ be a base for R and let C be the associated fundamental Weyl
chamber. Let w be an element of W with w ≠ I and let w = s_{α_1} s_{α_2} ⋯ s_{α_k}, with
α_j ∈ Δ, be a minimal expression for w. Then C and w · C lie on opposite sides of
the hyperplane V_{α_1} orthogonal to α_1.


Proof. Since w ≠ I, we must have k ≥ 1. If k = 1, then w = s_{α_1}, so that
w · C = s_{α_1} · C is on the opposite side of V_{α_1} from C. Assume, inductively, that the
result holds for u ∈ W where the minimal number of reflections needed to express
u is k − 1. Then consider w ∈ W having a minimal expression of the form w =
s_{α_1} ⋯ s_{α_k}. If we let u = s_{α_1} ⋯ s_{α_{k−1}}, then s_{α_1} ⋯ s_{α_{k−1}} must be a minimal expression
for u, since any shorter expression for u would result in a shorter expression for
w = u s_{α_k}. Thus, by induction, u · C and C must lie on opposite sides of V_{α_1}.
Suppose, toward a contradiction, that w · C lies on the same side of V_{α_1} as C. Then
u · C and w · C = u s_{α_k} · C lie on opposite sides of V_{α_1}, which implies that C and
s_{α_k} · C lie on opposite sides of u^{-1} · V_{α_1} = V_{u^{-1}·α_1}.

We now claim that there is only one hyperplane V_β, with β ∈ R, such that C
and s_{α_k} · C lie on opposite sides of V_β, namely, V_β = V_{α_k}. After all, as in the
proof of Proposition 8.22, we can choose H in the boundary of C so that H lies in
V_{α_k} but in no other hyperplane orthogonal to a root. We may then pass from C to
s_{α_k} · C along a line segment of the form H + tα_k, −ε < t < ε, and we will pass
through no hyperplane orthogonal to a root, other than V_{α_k}. We conclude, then, that
V_{u^{-1}·α_1} = V_{α_k}.

Now, if V_{u^{-1}·α_1} = V_{α_k}, it follows that s_{u^{-1}·α_1} = s_{α_k}, so that

$$s_{\alpha_k} = s_{u^{-1} \cdot \alpha_1} = u^{-1} s_{\alpha_1} u. \tag{8.7}$$

Substituting (8.7) into the formula w = u s_{α_k} gives

$$w = s_{\alpha_1} u = s_{\alpha_1}^2 s_{\alpha_2} \cdots s_{\alpha_{k-1}}.$$

Since s_{α_1}² = I, we conclude that w = s_{α_2} ⋯ s_{α_{k−1}}, which contradicts the assumption
that s_{α_1} ⋯ s_{α_k} was a minimal expression for w. □


Proof of Proposition 8.25. We proceed by induction on the minimal number of
reflections from Δ_C needed to express w. If the minimal number is zero, then w = I
and the result holds. If the minimal number is greater than zero, let w = s_{α_1} ⋯ s_{α_k}
be a minimal expression for w. By Lemma 8.26, C and w · C lie on opposite sides
of the hyperplane V_{α_1} orthogonal to α_1. Thus,

$$(w \cdot \bar{C}) \cap \bar{C} \subset V_{\alpha_1},$$

which means that w · H = H′ must lie in V_{α_1}. It follows that

$$s_{\alpha_1} \cdot w \cdot H = s_{\alpha_1} \cdot H' = H'.$$

That is to say, the Weyl group element w′ := s_{α_1} w also maps H to H′. But

$$w' = s_{\alpha_1} w = s_{\alpha_1}^2 s_{\alpha_2} \cdots s_{\alpha_k} = s_{\alpha_2} \cdots s_{\alpha_k}$$

is a product of fewer than k reflections from Δ_C. Thus, by induction, we have
H′ = H. □


Proposition 8.27. The Weyl group acts freely on the set of open Weyl chambers. If
H belongs to an open chamber C and w · H = H for some w ∈ W, then w = I.


Proof. Suppose w · C = C for some w ∈ W and some chamber C. Then w · H
belongs to C for all H ∈ C, which means, by Proposition 8.25, that w · H = H for
all H ∈ C. Thus, w is the identity on all of C, which means w must be the identity
on all of E. After all, if w is not the identity, the eigenspace of w with eigenvalue 1
is a subspace of E of codimension at least 1, and this eigenspace cannot contain the
nonempty open set C.

Meanwhile, if H belongs to an open chamber C and w · H = H, then w · C must
equal C, so that w = I. □


Proposition 8.28. For any two bases Δ_1 and Δ_2 for R, there exists a unique w ∈ W
such that w · Δ_1 = Δ_2.


Proof. By Proposition 8.21, there is a bijective correspondence between bases and
open Weyl chambers, and this correspondence is easily seen to respect the action of
the Weyl group. Since, by Propositions 8.23 and 8.27, W acts freely and transitively
on the chambers, the same is true for bases. □


Proposition 8.29. Let C be a Weyl chamber and H an element of E. Then there
exists exactly one point in the W-orbit of H that lies in the closure C̄ of C.


We are not saying that there is a unique w such that w · H ∈ C̄, but rather that
there exists a unique point H′ ∈ C̄ such that H′ can be expressed (not necessarily
uniquely) as H′ = w · H.


Proof. If U is any neighborhood of H, then by the argument in Exercise 2, the
hyperplanes V_α, α ∈ R, cannot fill up U, which means U contains points in some
open Weyl chamber. It follows that H belongs to D̄ for some chamber D. By
Proposition 8.23, there exists w ∈ W such that w · D = C and, thus, w · D̄ = C̄.
Thus, H′ := w · H is in C̄. Meanwhile, if H″ is a point in the W-orbit of H such that
H″ ∈ C̄, then H′ and H″ lie in the same W-orbit, which means (Proposition 8.25)
that H′ = H″. □
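Proposition 8.29 suggests a simple procedure for finding the representative of an orbit in the closed fundamental chamber. The loop below is not taken from the text; it is a standard sketch motivated by the distance argument in the proof of Proposition 8.23: whenever ⟨α, v⟩ < 0 for some α in the base, reflecting by s_α moves v strictly closer to any fixed point of the open fundamental chamber, and since the orbit is finite the loop must stop. The B_2 coordinates, the base, and the names `to_dominant` and `orbit` are assumptions for illustration.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, v):
    # s_alpha . v = v - (2 <alpha, v> / <alpha, alpha>) alpha
    c = Fraction(2 * inner(alpha, v), inner(alpha, alpha))
    return tuple(x - c * a for x, a in zip(v, alpha))

def to_dominant(v, base):
    # Reflect across a simple root whenever its inner product with v is negative
    v = tuple(Fraction(x) for x in v)
    while True:
        for alpha in base:
            if inner(alpha, v) < 0:
                v = reflect(alpha, v)
                break
        else:
            return v  # <alpha, v> >= 0 for every alpha in the base

def orbit(v, R):
    # The full W-orbit of v, generated by reflections in all roots
    seen, frontier = {tuple(v)}, [tuple(v)]
    while frontier:
        u = frontier.pop()
        for alpha in R:
            s = reflect(alpha, u)
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen

# B_2 in assumed coordinates, with an assumed base
R = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
base = [(1, -1), (0, 1)]

H = (-3, 1)
H_dominant = to_dominant(H, base)
dominant_in_orbit = [u for u in orbit(H, R) if all(inner(a, u) >= 0 for a in base)]

print(H_dominant, dominant_in_orbit)
```

For this starting point the orbit has 8 elements and exactly one of them lies in the closed fundamental chamber, which is the point the loop returns.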


Proposition 8.30. Let Δ be a base for R, let R⁺ be the associated set of positive
roots, and let α be an element of Δ. If β ∈ R⁺ and β ≠ α, then s_α · β ∈ R⁺. That
is to say, if α ∈ Δ, then s_α permutes the positive roots different from α.


Proof. Write β = Σ_{γ∈Δ} c_γ γ with c_γ ≥ 0. Since β ≠ α, there is some c_γ with
γ ≠ α and c_γ > 0. Now, since s_α · β = β − nα for some integer n, in the expansion
of s_α · β, only the coefficient of α has changed compared to the expansion of β. Thus,
the coefficient of γ in the expansion of s_α · β remains positive. But if one coefficient
of s_α · β is positive, all the other coefficients must be non-negative, showing that
s_α · β is a positive root. □
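As a quick check of Proposition 8.30, the sketch below uses an assumed integer model of G_2 in the plane x + y + z = 0, with base α_1 = (−1, 1, 0) (short) and α_2 = (2, −1, −1) (long) and positive roots α_1, α_2, α_1 + α_2, 2α_1 + α_2, 3α_1 + α_2, 3α_1 + 2α_2. The coordinates and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, v):
    # s_alpha . v = v - (2 <alpha, v> / <alpha, alpha>) alpha
    c = Fraction(2 * inner(alpha, v), inner(alpha, alpha))
    return tuple(x - c * a for x, a in zip(v, alpha))

def combo(m, n, a1, a2):
    # The vector m*a1 + n*a2
    return tuple(m * x + n * y for x, y in zip(a1, a2))

# Assumed model of G_2: a1 is the short simple root, a2 the long one
a1, a2 = (-1, 1, 0), (2, -1, -1)
base = [a1, a2]
R_plus = [combo(m, n, a1, a2)
          for m, n in [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]]

# For each simple root alpha, compare s_alpha(R_plus \ {alpha}) with R_plus \ {alpha}
permutes = [
    {reflect(alpha, b) for b in R_plus if b != alpha}
    == {b for b in R_plus if b != alpha}
    for alpha in base
]

print(permutes)
```

Each simple reflection sends its own root to its negative but maps the remaining five positive roots among themselves.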


8.6 Dynkin Diagrams 


A Dynkin diagram is a convenient graphical way of encoding the structure of a base 
for a root system R, and thus also of R itself. 


Definition 8.31. If Δ = {α_1, …, α_r} is a base for a root system R, the Dynkin
diagram for R is a graph having vertices v_1, …, v_r. Between two distinct vertices
v_j and v_k, we place zero, one, two, or three edges according to whether the angle
between α_j and α_k is π/2, 2π/3, 3π/4, or 5π/6. In addition, if α_j and α_k are not
orthogonal and have different lengths, we decorate the edges between v_j and v_k
with an arrow pointing from the vertex associated to the longer root to the vertex
associated to the shorter root.


Note that by Proposition 8.6, angles of 2π/3, 3π/4, or 5π/6 correspond to
length ratios of 1, √2, √3, respectively. Thus, the number of edges between vertices
corresponding to two nonorthogonal roots is 1, 2, or 3 according to whether the
length ratio (of the longer to the shorter) is 1, √2, or √3. Thinking of the arrow
decorating the edges as a “greater than” sign helps one to recall which way the
arrow should go.

Two Dynkin diagrams are said to be isomorphic if there is a one-to-one, onto
map of the vertices of one to the vertices of the other that preserves the number of
bonds and the direction of the arrows. By Proposition 8.28, any two bases Δ_1 and
Δ_2 for a fixed root system are related by the action of a unique Weyl group element
w. Since w preserves angles and lengths, the Dynkin diagrams associated to two
different bases for the same root system are isomorphic.


Fig. 8.10 The Dynkin diagrams for the rank-two root systems A_1 × A_1, A_2, B_2, and G_2

In the case of the root system G_2, for example, a base consists of two roots at
angle of 5π/6, with a length ratio of √3 (Figure 8.6). Thus, the Dynkin diagram
consists of two vertices connected by three edges, with an arrow pointing from the
longer root (α_2) to the shorter (α_1). One can similarly read off the Dynkin diagram
for A_2 from Figure 6.2 and the diagram for B_2 from Figure 8.7. Finally, for A_1 × A_1,
the two elements of the base are orthogonal, yielding the results in Figure 8.10.
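The edge count in Definition 8.31 can be read off from inner products alone: if θ is the angle between two base elements, then 4cos²θ = 4⟨α, β⟩² / (⟨α, α⟩⟨β, β⟩) equals 0, 1, 2, or 3 for θ = π/2, 2π/3, 3π/4, or 5π/6. The sketch below uses this observation; the function name `dynkin_edge`, the arrow labels, and the coordinates chosen for each rank-two base are assumptions for illustration.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def dynkin_edge(alpha, beta):
    # Returns (number of edges, arrow) for two distinct elements of a base.
    # The number of edges is 4 cos^2(angle between alpha and beta).
    ab = inner(alpha, beta)
    la, lb = inner(alpha, alpha), inner(beta, beta)
    n = Fraction(4 * ab * ab, la * lb)
    assert n.denominator == 1 and 0 <= n <= 3
    if n == 0 or la == lb:
        arrow = None  # orthogonal roots or equal lengths: no arrow
    else:
        # The arrow points from the longer root to the shorter root
        arrow = "alpha -> beta" if la > lb else "beta -> alpha"
    return int(n), arrow

# Bases for the rank-two root systems, in assumed coordinates
print(dynkin_edge((1, 0), (0, 1)))           # A_1 x A_1
print(dynkin_edge((1, -1, 0), (0, 1, -1)))   # A_2
print(dynkin_edge((1, -1), (0, 1)))          # B_2
print(dynkin_edge((-1, 1, 0), (2, -1, -1)))  # G_2
```

Working with squared lengths and the integer 4cos²θ avoids any floating-point comparison of angles.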


Proposition 8.32. 1. A root system is irreducible if and only if its Dynkin diagram
is connected.

2. If the Dynkin diagrams of two root systems R_1 and R_2 are isomorphic, then R_1
and R_2 themselves are isomorphic.


Proof. For Point 1, if a root system R decomposes as the direct sum of two root
systems R_1 and R_2, then we can obtain a base Δ for R as Δ = Δ_1 ∪ Δ_2, where Δ_1
and Δ_2 are bases for R_1 and R_2, respectively. Since elements of R_1 are orthogonal
to elements of R_2, each element of Δ_1 is orthogonal to each element of Δ_2. Thus, the
Dynkin diagram associated to Δ is disconnected.

Conversely, suppose the Dynkin diagram of R is disconnected. Then the base Δ
decomposes into two pieces Δ_1 and Δ_2 where (since there are no edges connecting
the pieces), each element of Δ_1 is orthogonal to each element of Δ_2. Thus, E is the
orthogonal direct sum of E_1 := span(Δ_1) and E_2 := span(Δ_2). If R_1 = R ∩ E_1
and R_2 = R ∩ E_2, then it is easy to check that R_1 and R_2 are root systems in E_1 and
E_2, respectively, and that Δ_1 is a base for R_1 and Δ_2 is a base for R_2.

Now, for each α ∈ Δ_1, the reflection s_α will act as the identity on E_2, and similarly
for α ∈ Δ_2. Since the Weyl groups of R, R_1, and R_2 are generated by the reflections
from Δ, Δ_1, and Δ_2, respectively, we see that the Weyl group W of R is the direct
product of the Weyl groups W_1 and W_2 of R_1 and R_2, with W_1 acting only on E_1
and W_2 acting only on E_2. Since W acts transitively on the bases of R and every
element of R is part of some base, we see that W · Δ = R. But since W = W_1 × W_2,
we have W · Δ = (W_1 · Δ_1) ∪ (W_2 · Δ_2). Thus, every element of R is either in E_1
or in E_2, meaning that R is the direct sum of R_1 and R_2.

For Point 2, using Point 1, we can reduce the problem to the case in which R_1 and
R_2 are irreducible and the Dynkin diagrams of R_1 and R_2 are connected. Let Δ_1 =
{α_1, …, α_r} and Δ_2 = {β_1, …, β_r} be bases for R_1 and R_2, respectively, ordered
so that the isomorphism of the Dynkin diagrams maps the vertex associated to α_j to
the vertex associated to β_j. We may rescale the inner product on E_1 so that ‖α_1‖ =
‖β_1‖. Since the Dynkin diagrams are connected and the diagram determines the
length ratios between vertices joined by an edge, it follows that ⟨α_j, α_k⟩ = ⟨β_j, β_k⟩
for all j and k. It is then easy to check that the unique linear map A: E_1 → E_2
such that Aα_j = β_j, j = 1, …, r, is an isometry. We then have that A s_{α_j} = s_{β_j} A
for all j.

Now, if α is any element of R_1, then, since W_1 is generated by the reflections
from Δ_1 and W_1 · Δ_1 = R_1, we see that

$$\alpha = s_{\alpha_{j_1}} \cdots s_{\alpha_{j_N}} \alpha_k$$

for some indices j_1, …, j_N and k. Thus,

$$A\alpha = s_{\beta_{j_1}} \cdots s_{\beta_{j_N}} \beta_k$$

is an element of R_2. The same reasoning shows that A^{-1}β ∈ R_1 for all β ∈ R_2.
Thus, A is an isometry mapping R_1 onto R_2, which implies that A is an isomorphism
of R_1 with R_2. □


Corollary 8.33. Let 𝔤 = 𝔨_ℂ be a semisimple Lie algebra, let 𝔥 = 𝔱_ℂ be a Cartan
subalgebra of 𝔤, and let R ⊂ i𝔱 be the root system of 𝔤 relative to 𝔥. Then 𝔤 is simple
if and only if the Dynkin diagram of R is connected.


Proof. According to Theorem 7.35, 𝔤 is simple if and only if R is irreducible. But
according to Point 1 of Proposition 8.32, R is irreducible if and only if the Dynkin
diagram of R is connected. □


8.7 Integral and Dominant Integral Elements 


We now introduce a notion of integrality for elements of E. In the setting of the
representations of a semisimple Lie algebra 𝔤, the weights of a finite-dimensional
representation of 𝔤 are always integral elements. Recall from Definition 8.10 the
notion of the coroot H_α associated to a root α.


Definition 8.34. An element μ of E is an integral element if for all α in R, the
quantity

$$\langle \mu, H_\alpha \rangle = 2 \frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle}$$

is an integer. If Δ is a base for R, an element μ of E is dominant (relative to Δ) if

$$\langle \alpha, \mu \rangle \geq 0$$

for all α ∈ Δ and strictly dominant if

$$\langle \alpha, \mu \rangle > 0$$

for all α ∈ Δ.


See Figure 6.2 for the integral and dominant integral elements in the case of A_2.
Additional examples will be given shortly.

A point μ ∈ E is strictly dominant relative to Δ if and only if μ is contained
in the open fundamental Weyl chamber associated to Δ, and μ is dominant if
and only if μ is contained in the closure of the open fundamental Weyl chamber.
Proposition 8.29 therefore implies the following result: For all μ ∈ E, there exists
w ∈ W such that w · μ is dominant.

Note that by the definition of a root system, every root is an integral element.
Thus, every integer linear combination of roots is also an integral element. In most
cases, however, there exist integral elements that are not expressible as an integer
combination of roots. In the case of A_2, for example, the elements labeled μ_1 and
μ_2 in Figure 6.2 are integral, but their expansions in terms of α_1 and α_2 are μ_1 =
2α_1/3 + α_2/3 and μ_2 = α_1/3 + 2α_2/3. Since α_1 and α_2 form a base for A_2, if μ_1
or μ_2 were an integer combination of roots, it would also be an integer combination
of α_1 and α_2.


Proposition 8.35. If μ ∈ E has the property that

$$2 \frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle}$$

is an integer for all α ∈ Δ, then the same holds for all α ∈ R and, thus, μ is an
integral element.


Proof. An element μ is integral if and only if ⟨μ, H_α⟩ is an integer for all α ∈ R.
By Proposition 8.18, each H_α with α ∈ R can be expressed as a linear combination
of the H_α’s with α ∈ Δ, with integer coefficients. Thus, if ⟨μ, H_α⟩ ∈ ℤ for α ∈ Δ,
then the same is true for α ∈ R, showing that μ is integral. □


Definition 8.36. Let Δ = {α_1, ..., α_r} be a base. Then the fundamental weights
(relative to Δ) are the elements μ_1, ..., μ_r with the property that

\[ 2\,\frac{\langle \mu_j, \alpha_k \rangle}{\langle \alpha_k, \alpha_k \rangle} = \delta_{jk}, \qquad j, k = 1, \ldots, r. \tag{8.8} \]

Elementary linear algebra shows there exists a unique set of integral elements
satisfying (8.8). Geometrically, the jth fundamental weight is the unique element
of E that is orthogonal to each α_k, k ≠ j, and whose orthogonal projection onto
α_j is one-half of α_j. Note that the set of dominant integral elements is precisely the
set of linear combinations of the fundamental weights with non-negative integer
coefficients; the set of all integral elements is the set of linear combinations of
fundamental weights with arbitrary integer coefficients.
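
Condition (8.8) says that the fundamental weights form the dual basis to the elements H_α = 2α/⟨α, α⟩, α ∈ Δ, so in rank two they can be found by inverting a 2 × 2 matrix. The Python sketch below does this; the function names are hypothetical, and the base used is the n = 2 case of the B_n base from Sect. 8.10.3, chosen only as an example.

```python
from fractions import Fraction as F

def coroot(alpha):
    """H_alpha = 2*alpha/<alpha, alpha>, with exact rational entries."""
    norm_sq = sum(a * a for a in alpha)
    return tuple(F(2 * a, norm_sq) for a in alpha)

def fundamental_weights_rank2(alpha1, alpha2):
    """Solve <mu_j, H_k> = delta_jk for a base {alpha1, alpha2} of R^2.

    The rows of the matrix M are the coroots; the columns of M^{-1}
    are then the fundamental weights.
    """
    (a, b), (c, d) = coroot(alpha1), coroot(alpha2)
    det = a * d - b * c
    mu1 = (d / det, -c / det)    # first column of the inverse
    mu2 = (-b / det, a / det)    # second column of the inverse
    return mu1, mu2

# Assumed example: the B2 base e_1 - e_2, e_2.
mu1, mu2 = fundamental_weights_rank2((1, -1), (0, 1))
print(mu1, mu2)
```

For this base the sketch gives μ_1 = e_1 and μ_2 = (e_1 + e_2)/2, which can be checked against (8.8) by hand.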


Definition 8.37. Let Δ be a base for R and R^+ the associated set of positive roots.
We then let δ denote half the sum of the positive roots:

\[ \delta = \frac{1}{2} \sum_{\alpha \in R^+} \alpha. \]


The element δ plays a key role in many of the developments in subsequent
chapters. It appears, for example, in the statement of the Weyl character formula
and the Weyl integral formula. Figure 8.11 shows the integral and dominant integral
elements for B2 and G2. In each case, the base {α_1, α_2} and the element δ are labeled,
and the fundamental weights are circled. The background square lattice (B2 case) or
triangular lattice (G2 case) indicates the set of all integral elements. The root system
G2 is unusual in that the fundamental weights are roots, which means that every
integral element is expressible as an integer combination of roots.


Proposition 8.38. The element δ is a strictly dominant integral element; indeed,

\[ 2\,\frac{\langle \beta, \delta \rangle}{\langle \beta, \beta \rangle} = 1 \]

for each β ∈ Δ.


Proof. If β ∈ Δ, then by Proposition 8.30, s_β permutes the elements of R^+ different
from β. Thus, we can decompose R^+ \ {β} as E_1 ∪ E_2, where elements of E_1 are
orthogonal to β and elements of E_2 are not. Then the elements of E_2 split up into
pairs {α, s_β · α}, where

\[ \langle s_\beta \cdot \alpha, \beta \rangle = \langle \alpha, s_\beta \cdot \beta \rangle = -\langle \alpha, \beta \rangle. \]

Thus, when we compute the inner product of β with half the sum of the positive
roots, the roots in E_1 do not contribute and the contributions from roots in E_2 cancel
in pairs. Thus, only the contribution from β itself remains:

\[ 2\,\frac{\langle \beta, \delta \rangle}{\langle \beta, \beta \rangle} = 2\,\frac{\langle \beta, \tfrac{1}{2}\beta \rangle}{\langle \beta, \beta \rangle} = 1, \]

as claimed. □
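
Proposition 8.38 is easy to test numerically in a concrete case. The following Python sketch is an illustration only: it assumes the description of B_3 from Sect. 8.10.3 (positive roots e_j ± e_k for j < k together with e_j, and base e_1 − e_2, e_2 − e_3, e_3), and all function names are ours.

```python
from fractions import Fraction as F
from itertools import combinations

def unit(n, j):
    """Standard basis vector of R^n (0-based index j)."""
    return tuple(1 if i == j else 0 for i in range(n))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def positive_roots_B(n):
    """Assumed positive roots of B_n: e_j, and e_j + e_k, e_j - e_k for j < k."""
    roots = [unit(n, j) for j in range(n)]
    for j, k in combinations(range(n), 2):
        roots.append(add(unit(n, j), unit(n, k)))
        roots.append(sub(unit(n, j), unit(n, k)))
    return roots

def base_B(n):
    """Assumed base of B_n: e_1 - e_2, ..., e_{n-1} - e_n, and e_n."""
    return [sub(unit(n, j), unit(n, j + 1)) for j in range(n - 1)] + [unit(n, n - 1)]

def half_sum(roots):
    """Half the sum of the given vectors, as exact fractions."""
    return tuple(F(sum(column), 2) for column in zip(*roots))

def pairing(mu, alpha):
    """2<mu, alpha>/<alpha, alpha>."""
    return F(2) * inner(mu, alpha) / inner(alpha, alpha)

delta = half_sum(positive_roots_B(3))
values = [pairing(delta, beta) for beta in base_B(3)]
print(delta, values)
```

Under these assumptions δ comes out as (5/2, 3/2, 1/2), and each entry of values equals 1, in agreement with the proposition.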
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Fig. 8.11 Integral and 
dominant integral elements 
for B2 (top) and G2 (bottom). 
The black dots indicate 
dominant integral elements, 
while the background lattice 
indicates the set of all integral 
elements 
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We now introduce a partial ordering on the set of integral elements, which will 
be used to formulate the theorem of the highest weight for representations of a
semisimple Lie algebra.
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Fig. 8.12 Points that are 
higher than zero (light and 
dark gray) and points that are 
dominant (dark gray) for B2 


Definition 8.39. If Δ = {α_1, ..., α_r} is a base, an element μ ∈ E is said to be
higher than λ ∈ E (relative to Δ) if μ − λ can be expressed as

\[ \mu - \lambda = c_1 \alpha_1 + \cdots + c_r \alpha_r, \]

where each c_j is a non-negative real number. We equivalently say that λ is lower
than μ and we write this relation as μ ⪰ λ or λ ⪯ μ.

The relation ⪰ defines a partial ordering on E. We now develop various useful
properties of this relation. For the rest of the section, we assume a base Δ for R has
been chosen, and that the notions of higher, lower, and dominant are defined relative
to Δ.


Proposition 8.40. If μ ∈ E is dominant, then μ ⪰ 0.

See Figure 8.12 for an example of the proposition.

For any basis {v_1, ..., v_r} of E, we can form a basis for the dual space E* by
considering the linear functionals ξ_j given by ξ_j(v_k) = δ_jk. We can then find unique
vectors v_j^* ∈ E such that ξ_j(u) = ⟨v_j^*, u⟩, so that

\[ \langle v_j^*, v_k \rangle = \delta_{jk}. \]

The basis {v_1^*, ..., v_r^*} for E is called the dual basis to {v_1, ..., v_r}.

Lemma 8.41. Suppose {v_1, ..., v_r} is an obtuse basis for E, meaning that
⟨v_j, v_k⟩ ≤ 0 for all j ≠ k. Then {v_1^*, ..., v_r^*} is an acute basis for E, meaning that
⟨v_j^*, v_k^*⟩ ≥ 0 for all j, k.

The proof of this elementary lemma is deferred until the end of this section.
Proof of Proposition 8.40. Any vector u can be expanded as u = Σ_j c_j α_j and the
coefficients may be computed as c_j = ⟨α_j^*, u⟩, where {α_j^*} is the dual basis to {α_j}.
Applying this with u = α_j^* gives

\[ \alpha_j^* = \sum_{k=1}^{r} \langle \alpha_k^*, \alpha_j^* \rangle \, \alpha_k. \]

Now, if μ is dominant, the coefficients in the expansion μ = Σ_j c_j α_j are given by

\[ c_j = \langle \alpha_j^*, \mu \rangle = \sum_k \langle \alpha_k^*, \alpha_j^* \rangle \langle \alpha_k, \mu \rangle. \]

Since μ is dominant, ⟨α_k, μ⟩ ≥ 0. Furthermore, the α_j's form an obtuse basis for
E (Proposition 8.13) and thus by Lemma 8.41, we have ⟨α_j^*, α_k^*⟩ ≥ 0 for all j, k.
Thus, c_j ≥ 0 for all j, which shows that μ is higher than zero. □
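
The mechanism of this proof can be traced in a small example. The Python sketch below is a hedged illustration: it uses the base (1, −1, 0), (0, 1, −1), (0, 1, 1) of A3 from Sect. 8.9, relies on the fact (Exercise 3) that the Gram matrix of the dual basis is the inverse of the Gram matrix of the base, and picks the dominant element μ = (2, 1, 0) arbitrarily. The helper names gram and inverse are ours.

```python
from fractions import Fraction as F

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram(basis):
    """Gram matrix G_jk = <v_j, v_k> with exact rational entries."""
    return [[F(inner(u, v)) for v in basis] for u in basis]

def inverse(m):
    """Gauss-Jordan inverse of a square matrix of Fractions (assumed invertible)."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

base = [(1, -1, 0), (0, 1, -1), (0, 1, 1)]   # an obtuse basis of R^3
G = gram(base)
H = inverse(G)                               # Gram matrix of the dual basis

mu = (2, 1, 0)                               # dominant: all <alpha_k, mu> >= 0
pairings = [inner(a, mu) for a in base]
# c_j = sum_k <alpha_j^*, alpha_k^*> <alpha_k, mu>
coeffs = [sum(H[j][k] * pairings[k] for k in range(3)) for j in range(3)]
rebuilt = tuple(sum(c * a[i] for c, a in zip(coeffs, base)) for i in range(3))
print(H, coeffs, rebuilt)
```

Every entry of H is non-negative, as Lemma 8.41 predicts, so the coefficients c_j are non-negative and μ is higher than zero.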
Proposition 8.42. If μ is dominant, then w · μ ⪯ μ for all w ∈ W.

Proof. Let O be the Weyl-group orbit of μ. Since O is a finite set, it contains a
maximal element λ, i.e., one such that there is no λ' ≠ λ in O that is higher than λ.
Then for all α ∈ Δ, we must have ⟨α, λ⟩ ≥ 0, since if ⟨α, λ⟩ were negative, then

\[ s_\alpha \cdot \lambda = \lambda - 2\,\frac{\langle \lambda, \alpha \rangle}{\langle \alpha, \alpha \rangle}\,\alpha \]

would be higher than λ. Thus, λ is dominant. But since, by Proposition 8.29, μ is
the unique dominant element of O, we must have λ = μ. Thus, μ is the unique
maximal element of O.

We now claim that every element of O is lower than μ. Certainly, no element
of O other than μ itself can be higher than μ. Let O' be the set of λ ∈ O that are
neither higher nor lower than μ. If O' is not empty, it contains a maximal element λ.
We now argue that λ is actually maximal in O. For any γ ∈ O, if γ ∈ O', then certainly
γ cannot be higher than λ, which is maximal in O'. On the other hand, if γ ∈ O \ O', then
γ is lower than μ, in which case, γ cannot be higher than λ, or else we would have
μ ⪰ γ ⪰ λ, so that λ would not be in O'. We conclude that λ is maximal in O, not
just in O', which means that λ must equal μ, the unique maximal element of O. But
this contradicts the assumption that λ ∈ O'. Thus, O' must actually be empty, and
μ is the highest element of O. □

Proposition 8.43. If μ is a strictly dominant integral element, then μ ⪰ δ, where δ
is as in Definition 8.37.

Proof. Since μ is strictly dominant and integral, the quantity 2⟨μ, α⟩/⟨α, α⟩ is a
positive integer, hence at least 1, for each α ∈ Δ. Thus μ − δ will still be dominant in
light of Proposition 8.38. By Proposition 8.40, μ − δ ⪰ 0, which is equivalent to
μ ⪰ δ. □
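
Both of the last two propositions can be observed directly in B2. The Python sketch below is illustrative: it assumes the base e_1 − e_2, e_2 of Sect. 8.10.3 with n = 2, generates the Weyl-group orbit by repeatedly applying the two simple reflections, and tests the relation "higher than" by solving for the coefficients in Definition 8.39. The element μ = (3, 1) and all function names are choices made for this example.

```python
from fractions import Fraction as F

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(v, a):
    """s_a . v = v - 2<v, a>/<a, a> a."""
    c = F(2) * inner(v, a) / inner(a, a)
    return tuple(x - c * y for x, y in zip(v, a))

def weyl_orbit(mu, base):
    """Orbit of mu under the group generated by the simple reflections."""
    orbit = {tuple(F(x) for x in mu)}
    frontier = list(orbit)
    while frontier:
        v = frontier.pop()
        for a in base:
            w = reflect(v, a)
            if w not in orbit:
                orbit.add(w)
                frontier.append(w)
    return orbit

def base_coefficients(v, a1, a2):
    """Solve c1*a1 + c2*a2 = v in R^2 by Cramer's rule."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    c1 = F(v[0] * a2[1] - v[1] * a2[0]) / det
    c2 = F(a1[0] * v[1] - a1[1] * v[0]) / det
    return c1, c2

def is_higher(mu, lam, a1, a2):
    """True if mu - lam is a non-negative combination of a1 and a2."""
    diff = tuple(m - l for m, l in zip(mu, lam))
    return all(c >= 0 for c in base_coefficients(diff, a1, a2))

ALPHA1, ALPHA2 = (1, -1), (0, 1)     # assumed base of B2
mu = (3, 1)                          # strictly dominant and integral
delta = (F(3, 2), F(1, 2))           # half the sum of e1-e2, e1+e2, e1, e2
orbit = weyl_orbit(mu, [ALPHA1, ALPHA2])
print(len(orbit))
print(all(is_higher(mu, lam, ALPHA1, ALPHA2) for lam in orbit))
print(is_higher(mu, delta, ALPHA1, ALPHA2))
```

In this example the orbit has eight elements, each of them is lower than μ, and μ is higher than δ; a non-dominant orbit element such as (1, 3) is not higher than μ.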


Recall from Definition 6.23 the notion of the convex hull of a finite collection 
of vectors in E. We let W · μ denote the Weyl-group orbit of μ ∈ E and we let
Conv(W · μ) denote the convex hull of W · μ.


Proposition 8.44. 1. If μ and λ are dominant, then λ belongs to Conv(W · μ) if
and only if λ ⪯ μ.

2. Let μ and λ be elements of E with μ dominant. Then λ belongs to Conv(W · μ)
if and only if w · λ ⪯ μ for all w ∈ W.


Figure 8.13 illustrates Point 2 of the proposition in the case of B2. In the figure,
the shaded region represents the set of points that are lower than μ. The point λ_1 is
inside Conv(W · μ) and w · λ_1 is lower than μ for all w. By contrast, λ_2 is outside
Conv(W · μ) and there is some w for which w · λ_2 is not lower than μ.

Since Conv(W · μ) is convex and Weyl invariant, we see that if λ belongs to
Conv(W · μ), then every point in Conv(W · λ) also belongs to Conv(W · μ). Thus,
Point 1 of the proposition may be restated as follows:

If μ and λ are dominant, then λ ⪯ μ if and only if
Conv(W · λ) ⊂ Conv(W · μ).

We establish two lemmas that will lead to a proof of Proposition 8.44.


Lemma 8.45. Suppose K is a compact, convex subset of E and λ is an element of
E that is not in K. Then there is an element γ of E such that for all η ∈ K, we
have

\[ \langle \gamma, \lambda \rangle > \langle \gamma, \eta \rangle. \]

If we let V be the hyperplane (not necessarily through the origin) given by

\[ V = \{ \rho \in E \mid \langle \gamma, \rho \rangle = \langle \gamma, \lambda \rangle - \varepsilon \} \]

for some small ε, then K and λ lie on opposite sides of V. Lemma 8.45 is a special
case of the hyperplane separation theorem in the theory of convex sets.

Proof. Since K is compact, we can choose an element η_0 of K that minimizes the
distance to λ. Set γ = λ − η_0, so that

\[ \langle \gamma, \lambda - \eta_0 \rangle = \langle \lambda - \eta_0, \lambda - \eta_0 \rangle > 0, \]

and, thus, ⟨γ, λ⟩ > ⟨γ, η_0⟩.

Fig. 8.13 The element w · λ_2 is not lower than μ

Now, for any η ∈ K, the vector η_0 + s(η − η_0) belongs to K for 0 ≤ s ≤ 1, and
we compute that

\[ d(\lambda, \eta_0 + s(\eta - \eta_0))^2 = \langle \lambda - \eta_0, \lambda - \eta_0 \rangle - 2s \langle \lambda - \eta_0, \eta - \eta_0 \rangle + s^2 \langle \eta - \eta_0, \eta - \eta_0 \rangle. \]

The only way this quantity can be greater than or equal to ⟨λ − η_0, λ − η_0⟩ =
d(λ, η_0)^2 for small positive s is if

\[ \langle \lambda - \eta_0, \eta - \eta_0 \rangle = \langle \gamma, \eta - \eta_0 \rangle \leq 0. \]

Thus,

\[ \langle \gamma, \eta \rangle \leq \langle \gamma, \eta_0 \rangle < \langle \gamma, \lambda \rangle, \]

which is what we wanted to prove. □
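
The construction in this proof (take the nearest point η_0 of K and set γ = λ − η_0) can be tried out in the simplest case. In the Python sketch below, K is assumed to be the closed unit disk in R^2, for which the nearest point is obtained by rescaling; this choice of K, the point λ = (2, 1), and the function names are illustrative assumptions and are not taken from the text.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def nearest_in_unit_ball(lam):
    """Point of the closed unit ball closest to lam."""
    norm = math.sqrt(inner(lam, lam))
    if norm <= 1.0:
        return tuple(lam)
    return tuple(x / norm for x in lam)

def separating_vector(lam):
    """Return (gamma, eta0) with gamma = lam - eta0, as in the proof."""
    eta0 = nearest_in_unit_ball(lam)
    gamma = tuple(l - e for l, e in zip(lam, eta0))
    return gamma, eta0

lam = (2.0, 1.0)                      # a point outside the unit disk
gamma, eta0 = separating_vector(lam)

# Sample points eta of the disk and compare <gamma, eta> with <gamma, lam>.
samples = [(r * math.cos(math.radians(t)), r * math.sin(math.radians(t)))
           for t in range(0, 360, 5) for r in (0.0, 0.5, 1.0)]
print(max(inner(gamma, eta) for eta in samples) < inner(gamma, lam))
```

The sampled values of ⟨γ, η⟩ never exceed ⟨γ, η_0⟩ (up to rounding), and ⟨γ, η_0⟩ is strictly smaller than ⟨γ, λ⟩, which is the chain of inequalities at the end of the proof.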


Lemma 8.46. If μ and λ are dominant and λ ∉ Conv(W · μ), there exists a
dominant element γ ∈ E such that

\[ \langle \gamma, \lambda \rangle > \langle \gamma, w \cdot \mu \rangle \tag{8.9} \]

for all w ∈ W.

Proof. By Lemma 8.45, we can find some γ in E, not necessarily dominant, such
that ⟨γ, λ⟩ > ⟨γ, η⟩ for all η ∈ Conv(W · μ). In particular,

\[ \langle \gamma, \lambda \rangle > \langle \gamma, w \cdot \mu \rangle \]

for all w ∈ W. Choose some w_0 so that γ' := w_0 · γ is dominant. We will show that
replacing γ by γ' makes ⟨γ, λ⟩ bigger while permuting the values of ⟨γ, w · μ⟩.

By Proposition 8.42, γ ⪯ γ', meaning that γ' equals γ plus a non-negative linear
combination of positive simple roots. But since λ is dominant, it has non-negative
inner product with each positive simple root, and we see that ⟨γ', λ⟩ ≥ ⟨γ, λ⟩. Thus,

\[ \langle \gamma', \lambda \rangle \geq \langle \gamma, \lambda \rangle > \langle \gamma, w \cdot \mu \rangle \]

for all w. But

\[ \langle \gamma, w \cdot \mu \rangle = \langle w_0^{-1} \cdot \gamma', w \cdot \mu \rangle = \langle \gamma', (w_0 w) \cdot \mu \rangle. \]

Thus, as w ranges over W, the values of ⟨γ, w · μ⟩ and ⟨γ', (w_0 w) · μ⟩ range through
the same set of real numbers. Thus, ⟨γ', λ⟩ > ⟨γ', w · μ⟩ for all w, as claimed. □

The proof of Lemma 8.46 is illustrated in Figure 8.14. The dominant element λ
is not in Conv(W · μ) and is separated from Conv(W · μ) by a line with orthogonal
vector γ. The element γ' := s_α · γ is dominant and λ is also separated from
Conv(W · μ) by a line with orthogonal vector γ'. The existence of such a line means
that λ cannot be lower than μ.

Fig. 8.14 The element λ is separated from Conv(W · μ) first by a line orthogonal to γ and then
by a line orthogonal to the dominant element γ'


Proof of Proposition 8.44. For Point 1, let μ and λ be dominant. Assume first that
λ is in Conv(W · μ). By Proposition 8.42, every element of the form w · μ is lower
than μ. But the set of elements lower than μ is easily seen to be convex, and
so this set must contain Conv(W · μ) and, in particular, λ. Next, assume λ ⪯ μ and
suppose, toward a contradiction, that λ ∉ Conv(W · μ). Let γ be a dominant
element as in Lemma 8.46. Then μ − λ is a non-negative linear combination of
positive simple roots, and γ, being dominant, has non-negative inner product with
each positive simple root. Thus, ⟨γ, μ − λ⟩ ≥ 0 and, hence, ⟨γ, μ⟩ ≥ ⟨γ, λ⟩, which
contradicts (8.9). Thus, λ must actually belong to Conv(W · μ).

For Point 2, assume first that w · λ ⪯ μ for all w ∈ W, and choose w so that w · λ
is dominant. Since w · λ ⪯ μ, Point 1 tells us that w · λ belongs to Conv(W · μ),
which implies that λ also belongs to Conv(W · μ). In the other direction, assume
λ ∈ Conv(W · μ) so that w · λ ∈ Conv(W · μ) for all w ∈ W. Using Proposition 8.42
we can easily see that every element of Conv(W · μ) is lower than μ. Thus, w · λ ⪯ μ
for all w. □


It remains only to supply the proof of Lemma 8.41. 


Proof of Lemma 8.41. We proceed by induction on the dimension r of E. When
r = 1 the result is trivial. When r = 2, the result should be geometrically obvious,
but we give an algebraic proof. The Gram matrix of a basis is the collection of inner
products, G_jk := ⟨v_j, v_k⟩. It is an elementary exercise (Exercise 3) to show that the
Gram matrix of the dual basis is the inverse of the Gram matrix of the original basis.
Thus, in the r = 2 case, we have

\[ \begin{pmatrix} \langle v_1^*, v_1^* \rangle & \langle v_1^*, v_2^* \rangle \\ \langle v_1^*, v_2^* \rangle & \langle v_2^*, v_2^* \rangle \end{pmatrix} = \frac{1}{\langle v_1, v_1 \rangle \langle v_2, v_2 \rangle - \langle v_1, v_2 \rangle^2} \begin{pmatrix} \langle v_2, v_2 \rangle & -\langle v_1, v_2 \rangle \\ -\langle v_1, v_2 \rangle & \langle v_1, v_1 \rangle \end{pmatrix}. \tag{8.10} \]

Since v_1 is not a multiple of v_2, the Cauchy-Schwarz inequality tells us that the
denominator on the right-hand side of (8.10) is positive, which means that ⟨v_1^*, v_2^*⟩
has the opposite sign of ⟨v_1, v_2⟩.

Assume now that the result holds in dimension r ≥ 2 and consider the case of
dimension r + 1. Fix any index m and let P be the orthogonal projection onto the
orthogonal complement of v_m, which is given by

\[ P(u) = u - \frac{\langle v_m, u \rangle}{\langle v_m, v_m \rangle}\, v_m. \]

The operator P is easily seen to be self-adjoint, meaning that ⟨u, Pv⟩ = ⟨Pu, v⟩
for all u, v.

We now claim that $Pv_1, \ldots, \widehat{Pv_m}, \ldots, Pv_{r+1}$ is an obtuse basis for ⟨v_m⟩^⊥, where
the notation $\widehat{Pv_m}$ indicates that Pv_m is omitted. Indeed, a little algebra shows that

\[ \langle Pv_j, Pv_k \rangle = \langle v_j, v_k \rangle - \frac{\langle v_m, v_j \rangle \langle v_m, v_k \rangle}{\langle v_m, v_m \rangle} \leq 0, \]

since ⟨v_j, v_k⟩, ⟨v_m, v_j⟩, and ⟨v_m, v_k⟩ are all less than or equal to zero. Furthermore,
for j and k different from m, we have

\[ \langle v_j^*, Pv_k \rangle = \langle Pv_j^*, v_k \rangle = \langle v_j^*, v_k \rangle = \delta_{jk}, \]

since v_j^* is orthogonal to v_m. Thus, the dual basis to $Pv_1, \ldots, \widehat{Pv_m}, \ldots, Pv_{r+1}$
consists simply of the vectors $v_1^*, \ldots, \widehat{v_m^*}, \ldots, v_{r+1}^*$ (all of which are orthogonal
to v_m).

Now fix any two distinct indices j and k. Since r + 1 ≥ 3, we can choose some
other index m distinct from both j and k. Applying our induction hypothesis to the
basis $Pv_1, \ldots, \widehat{Pv_m}, \ldots, Pv_{r+1}$ for ⟨v_m⟩^⊥, we conclude that ⟨v_j^*, v_k^*⟩ ≥ 0, which is
what we are trying to prove. □


8.9 Examples in Rank Three 


In rank three, we can have a reducible root system, which must be a direct sum of
A1 with one of the rank-two root systems described in the previous section. In this
section, we will consider only the irreducible root systems of rank three. There are,
up to isomorphism, three irreducible root systems in rank three, customarily denoted
A3, B3, and C3. They arise from the Lie algebras sl(4; C), so(7; C), and sp(3; C),
respectively, as described in Sect. 7.7.

The models in this section can be constructed using the Zometool system, 
available at www.zometool.com. The reader is encouraged to obtain some Zometool 
pieces and build the rank-three root systems for him- or herself. The models require 
the green lines, which are not part of the basic Zometool kits. The models of the 
C3 root system use half-length greens, although one can alternatively use whole 
greens together with double (end to end) blue pieces. The images shown here 
were rendered in Scott Vorthmann’s vZome software, available at vzome.com. For 
detailed instructions on how to build rank-three root systems using Zometool, click 
on the “Book” tab of the author’s web site: www.nd.edu/~bhall/. 

Figure 8.15 shows the A3 root system, with a base highlighted. The elements
of A3 form the vertices of a polyhedron known as a cuboctahedron, which has six
square faces and eight triangular faces, as shown in Figure 8.16. The points in A3
can also be visualized as the midpoints of the edges of a cube, as in Figure 8.17.
Algebraically, we can describe A3 as the set of 12 vectors in R^3 of the form
(±1, ±1, 0), (±1, 0, ±1), and (0, ±1, ±1). (This set of vectors actually corresponds
to the conventional description of the D3 root system, as in Sect. 7.7.2, which turns
out to be isomorphic to A3.) It is then a simple exercise to check that this collection
of vectors is, in fact, a root system. A base for this system is given by the vectors
(1, −1, 0), (0, 1, −1), and (0, 1, 1).

Fig. 8.15 The A3 root system, with the elements of the base in dark gray

Fig. 8.16 The roots in A3 make up the vertices of a cuboctahedron
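
The "simple exercise" can also be delegated to a short program. The following Python sketch checks, for the 12 vectors above, that the roots are nonzero, that the only multiples of a root in the set are the root and its negative, that the set is invariant under each reflection, and that the pairings 2⟨β, α⟩/⟨α, α⟩ are integers; the spanning condition is not checked. It also expands each root in the stated base using a formula solved by hand for this particular base. The function names are ours.

```python
from fractions import Fraction as F
from itertools import product

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(v, a):
    """s_a . v = v - 2<v, a>/<a, a> a."""
    c = F(2 * inner(v, a), inner(a, a))
    return tuple(x - c * y for x, y in zip(v, a))

def d3_vectors():
    """The 12 vectors (+-1, +-1, 0), (+-1, 0, +-1), (0, +-1, +-1)."""
    vectors = set()
    for j, k in ((0, 1), (0, 2), (1, 2)):
        for s, t in product((1, -1), repeat=2):
            v = [0, 0, 0]
            v[j], v[k] = s, t
            vectors.add(tuple(v))
    return vectors

def check_root_axioms(roots):
    """Check the root-system conditions other than spanning."""
    for a in roots:
        if not any(a):                                   # nonzero
            return False
        for b in roots:
            if F(2 * inner(b, a), inner(a, a)).denominator != 1:
                return False                             # integrality
            if reflect(b, a) not in roots:
                return False                             # reflection invariance
            parallel = inner(a, b) ** 2 == inner(a, a) * inner(b, b)
            if parallel and b != a and b != tuple(-x for x in a):
                return False                             # only multiples are +-a
    return True

def base_coefficients(v):
    """Coefficients of v in the base (1,-1,0), (0,1,-1), (0,1,1)."""
    x, y, z = v
    return (F(x), F(x + y - z, 2), F(x + y + z, 2))

roots = d3_vectors()
print(len(roots), check_root_axioms(roots))
```

For every root the coefficients returned by base_coefficients are integers that are either all non-negative or all non-positive, as required of a base.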

The Weyl group W for A3 is the symmetry group of the tetrahedron pictured in 
Figure 8.18. The group W is the full permutation group on the four vertices of the 
tetrahedron. As in the A2 case, the Weyl group of A3 is not the full symmetry group
of the root system, since −I is not an element of W.

The B3 root system is obtained from the A3 root system by adding six additional
vectors, consisting of three mutually orthogonal pairs. Each of the new roots
is shorter than the original roots by a factor of √2, as shown in Figure 8.19.
Algebraically, B3 consists of the twelve vectors (±1, ±1, 0), (±1, 0, ±1), and
(0, ±1, ±1) of A3, together with the six vectors (±1, 0, 0), (0, ±1, 0), and
(0, 0, ±1).

Fig. 8.17 The roots in A3 lie at the midpoints of the edges of a cube

Fig. 8.18 The Weyl group of A3 is the symmetry group of a regular tetrahedron

The C3 root system, meanwhile, is obtained from A3 by adding six new vectors,
as in the case of B3, except that this time the new roots are longer than the original
roots by a factor of √2, as in Figure 8.20. That is to say, the new roots are the
six vectors (±2, 0, 0), (0, ±2, 0), and (0, 0, ±2). The C3 root system is the dual of
B3, in the sense of Definition 8.10. The elements of C3 make up the vertices of an
octahedron, together with the midpoints of the edges of the octahedron, as shown in
Figure 8.21.
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Fig. 8.19 The B3 root system, with the elements of the base in dark gray 


Fig. 8.20 The C3 root system, with the elements of the base in dark gray 
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Fig. 8.21 The C3 root system consists of the vertices of an octahedron, together with the midpoints 
of the edges of the octahedron 


The root systems B3 and C3 have the same Weyl group, which is the symmetry
group of the cube in Figure 8.17. In both cases, the Weyl group is the full symmetry
group of the root system.


8.10 The Classical Root Systems 


We now return to the root systems of the classical semisimple Lie algebras, as 
computed in Sect. 7.7. For each of these root systems, we describe a base and 
determine the associated Dynkin diagram. 


8.10.1 The A_n Root System


The A_n root system is associated to the Lie algebra sl(n + 1; C). For this root system,
E is the subspace of R^{n+1} consisting of vectors whose entries sum to zero. The roots
are the vectors of the form

\[ e_j - e_k, \qquad j \neq k, \]

where {e_j} is the standard basis for R^{n+1}. As a base, we may take the vectors

\[ e_1 - e_2,\ e_2 - e_3,\ \ldots,\ e_n - e_{n+1}. \]

Note that for j < k,

\[ e_j - e_k = (e_j - e_{j+1}) + (e_{j+1} - e_{j+2}) + \cdots + (e_{k-1} - e_k), \]

so that every root is a sum of elements of the base, or the negative thereof.
All roots in the base have the same length, two consecutive roots are at an angle
of 2π/3 to one another, and nonconsecutive roots are orthogonal.


8.10.2 The D_n Root System


The D_n root system is associated to the Lie algebra so(2n; C), n ≥ 2. For this root
system, E = R^n and the roots are the vectors of the form

\[ \pm e_j \pm e_k, \qquad j < k. \]

As a base, we may take the n − 1 roots

\[ e_1 - e_2,\ e_2 - e_3,\ \ldots,\ e_{n-2} - e_{n-1},\ e_{n-1} - e_n, \tag{8.11} \]

together with the one additional root,

\[ e_{n-1} + e_n. \tag{8.12} \]

Note that for j < k, we have the following formulas:

\[ \begin{aligned} e_j - e_k &= (e_j - e_{j+1}) + (e_{j+1} - e_{j+2}) + \cdots + (e_{k-1} - e_k), \\ e_j + e_n &= (e_j - e_{n-1}) + (e_{n-1} + e_n), \\ e_j + e_k &= (e_j + e_n) + (e_k - e_n). \end{aligned} \tag{8.13} \]

This shows that every root of the form e_j − e_k or e_j + e_k (j < k) can be written
as a linear combination of the base in (8.11) and (8.12) with non-negative integer
coefficients. The roots of this form are the positive roots, and the remaining roots
are the negatives of these roots.

Two consecutive roots in the list (8.11) have an angle of 2π/3 and two
nonconsecutive roots in the list (8.11) are orthogonal. The angle between the root
in (8.12) and the second-to-last element in the list (8.11) is 2π/3; the root in (8.12)
is orthogonal to all the other roots in (8.11).


8.10.3 The B_n Root System


The B_n root system is associated to the Lie algebra so(2n + 1; C). For this root
system, E = R^n and the roots are the vectors of the form

\[ \pm e_j \pm e_k, \qquad j < k, \]

and of the form

\[ \pm e_j, \qquad j = 1, \ldots, n. \]

As a base for our root system, we may take the n − 1 roots

\[ e_1 - e_2,\ e_2 - e_3,\ \ldots,\ e_{n-1} - e_n \tag{8.14} \]

(exactly as in the so(2n; C) case) together with the one additional root,

\[ e_n. \tag{8.15} \]

The positive roots are those of the form e_j + e_k or e_j − e_k (j < k) and those of the
form e_j (1 ≤ j ≤ n). To expand every positive root in terms of the base, we use the
formulas in (8.13), except with the second line replaced by

\[ e_j + e_n = (e_j - e_n) + 2e_n, \tag{8.16} \]

and with the additional relation

\[ e_j = (e_j - e_n) + e_n. \tag{8.17} \]

As in the so(2n; C) case, consecutive roots in the list (8.14) have an angle of
2π/3, whereas nonconsecutive roots on the list (8.14) are orthogonal. Meanwhile,
the root in (8.15) has an angle of 3π/4 with the last root in (8.14) and is orthogonal
to the remaining roots in (8.14).

In Sect. 8.2, we have pictured the B2 root system rotated by π/4 relative to
the n = 2 case of the root system described in this subsection. The pictures in
Sect. 8.2 actually correspond to the conventional description of the C2 root system
(Sect. 8.10.4), which is isomorphic to B2.


8.10.4 The C_n Root System


The C_n root system is associated to the Lie algebra sp(n; C). For this root system,
E = R^n and the roots are the vectors of the form

\[ \pm e_j \pm e_k, \qquad j < k, \]

and of the form

\[ \pm 2e_j, \qquad j = 1, \ldots, n. \]

As a base, we may take the n − 1 roots

\[ e_1 - e_2,\ e_2 - e_3,\ \ldots,\ e_{n-1} - e_n \tag{8.18} \]

(as in the two preceding subsections), together with the root 2e_n. We use the same
formula for expanding roots in terms of the base as in the case of so(2n + 1; C),
except that (8.17) is rewritten as

\[ 2e_j = 2(e_j - e_n) + (2e_n). \]

The angle between two consecutive roots in (8.18) is 2π/3; nonconsecutive roots
in (8.18) are orthogonal. The angle between 2e_n and the last root in (8.18) is 3π/4;
the root 2e_n is orthogonal to the other roots in (8.18).
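
The angles quoted in the preceding four subsections can be recomputed from the bases themselves. The Python sketch below builds the four bases as described above (with 0-based indices internally) and evaluates the angle between two base elements in floating point; the function names are hypothetical and the value n = 4 in the sample calls is arbitrary.

```python
import math

def unit(n, j):
    """Standard basis vector of R^n (0-based index j)."""
    return tuple(1 if i == j else 0 for i in range(n))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def chain(count, dim):
    """The roots e_1 - e_2, ..., e_count - e_{count+1} in R^dim."""
    return [sub(unit(dim, j), unit(dim, j + 1)) for j in range(count)]

def base_A(n):
    return chain(n, n + 1)

def base_B(n):
    return chain(n - 1, n) + [unit(n, n - 1)]

def base_C(n):
    return chain(n - 1, n) + [tuple(2 * x for x in unit(n, n - 1))]

def base_D(n):
    return chain(n - 1, n) + [add(unit(n, n - 2), unit(n, n - 1))]

def angle(u, v):
    """Angle between u and v in radians."""
    c = inner(u, v) / math.sqrt(inner(u, u) * inner(v, v))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding

b, c, d = base_B(4), base_C(4), base_D(4)
print(angle(b[-1], b[-2]) / math.pi)   # short root e_n against e_{n-1} - e_n
print(angle(c[-1], c[-2]) / math.pi)   # long root 2e_n against e_{n-1} - e_n
print(angle(d[-1], d[-3]) / math.pi)   # e_{n-1} + e_n against e_{n-2} - e_{n-1}
```

Up to rounding, the three printed values are 3/4, 3/4, and 2/3, matching the angles 3π/4, 3π/4, and 2π/3 stated in the text.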


8.10.5 The Classical Dynkin Diagrams 


From the calculations in the previous subsections, we can read off the Dynkin
diagram for the root systems A_n, B_n, C_n, and D_n; the results are recorded in
Figure 8.22. We can see that certain special things happen for small values of n.
First, the Dynkin diagram for D_n does not make sense when n = 1, since the
diagram always has at least two vertices. This observation reflects the fact that
the Lie algebra so(2; C) is not semisimple. Second, the Dynkin diagram for D2
is not connected, which means (Corollary 8.33) that the Lie algebra so(4; C) is
semisimple but not simple. (Compare Exercise 4 in Chapter 7.) Third, the Dynkin
diagrams for A1, B1, and C1 are isomorphic, reflecting that the rank-one Lie algebras
sl(2; C), so(3; C), and sp(1; C) are isomorphic. Last, we have an isomorphism
between the diagrams for B2 and C2 and an isomorphism between the diagrams for
A3 and D3, which reflects (Sect. 8.11) an isomorphism between the corresponding
Lie algebras.

Fig. 8.22 The Dynkin diagrams of the classical Lie algebras


Corollary 8.47. The following semisimple Lie algebras are simple: the special
linear algebras sl(n + 1; C), n ≥ 1; the odd orthogonal algebras so(2n + 1; C),
n ≥ 1; the even orthogonal algebras so(2n; C), n ≥ 3; and the symplectic algebras
sp(n; C), n ≥ 1.


Proof. The Dynkin diagrams for A_n, B_n, and C_n are always connected, whereas the
Dynkin diagram for D_n is connected for n ≥ 3. Thus, Corollary 8.33 shows that the
claimed Lie algebras are simple. □


8.11 The Classification 


In this section, we describe, without proof, the classification of irreducible root 
systems and of simple Lie algebras. Recall (Corollary 8.33) that a semisimple Lie 
algebra is simple if and only if its Dynkin diagram is connected. 

Every irreducible root system is either the root system of a classical Lie algebra
(types A_n, B_n, C_n, and D_n) or one of five exceptional root systems. We begin by
listing the Dynkin diagrams of the exceptional root systems.


Theorem 8.48. For each of the graphs in Figure 8.23, there exists a root system 
having that graph as its Dynkin diagram. 


We have already described the root system G2 in Figure 8.3. Although it is
possible to write down the remaining exceptional root systems explicitly, it is not 
terribly useful to do so, since there is no comparably easy way to construct the Lie 
algebras associated to these root systems. 


Fig. 8.23 The exceptional Dynkin diagrams (labeled E6, E7, E8, F4, and G2)


Theorem 8.49. Every irreducible root system is isomorphic to exactly one of the
following:

• A_n, n ≥ 1
• B_n, n ≥ 2
• C_n, n ≥ 3
• D_n, n ≥ 4
• One of the exceptional root systems G2, F4, E6, E7, and E8.


The restrictions on n are to avoid the case of D2, which is not irreducible, and
to avoid repetitions. The Dynkin diagram for D3, for example, is isomorphic to the
Dynkin diagram for A3, which means (Proposition 8.32) that the A3 and D3 root
systems are also isomorphic. Similarly, all root systems in rank one are isomorphic,
and B2 is isomorphic to C2.
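
The low-rank coincidences can be confirmed by comparing Cartan data. The Python sketch below is illustrative and rests on two stated assumptions: it encodes a Dynkin diagram by the integer matrix with entries 2⟨α_j, α_k⟩/⟨α_k, α_k⟩ (one common convention, adopted here for the example), and it treats two diagrams as isomorphic when these matrices agree after relabeling the base. The bases are those of Sect. 8.10, and the function names are ours.

```python
from fractions import Fraction as F
from itertools import permutations

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def cartan_matrix(base):
    """Matrix with entries 2<a_j, a_k>/<a_k, a_k> (integers for a base)."""
    return [[int(F(2 * inner(a, b), inner(b, b))) for b in base] for a in base]

def same_diagram(m1, m2):
    """True if m1 and m2 agree after some simultaneous relabeling of indices."""
    n = len(m1)
    if n != len(m2):
        return False
    return any(
        all(m1[p[i]][p[j]] == m2[i][j] for i in range(n) for j in range(n))
        for p in permutations(range(n))
    )

A3 = cartan_matrix([(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1)])
D3 = cartan_matrix([(1, -1, 0), (0, 1, -1), (0, 1, 1)])
B2 = cartan_matrix([(1, -1), (0, 1)])
C2 = cartan_matrix([(1, -1), (0, 2)])
B3 = cartan_matrix([(1, -1, 0), (0, 1, -1), (0, 0, 1)])
C3 = cartan_matrix([(1, -1, 0), (0, 1, -1), (0, 0, 2)])

print(same_diagram(A3, D3), same_diagram(B2, C2), same_diagram(B3, C3))
```

With this encoding, A3 matches D3 and B2 matches C2, while B3 and C3 do not match, which is consistent with the restrictions on n in Theorem 8.49.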


Theorem 8.50. 1. If g is a complex semisimple Lie algebra and h_1 and h_2 are
Cartan subalgebras of g, there exists an automorphism φ : g → g such that
φ(h_1) = h_2.

2. Suppose g_1 and g_2 are semisimple Lie algebras with Cartan subalgebras h_1
and h_2, respectively. If the root systems associated to (g_1, h_1) and (g_2, h_2) are
isomorphic, then g_1 and g_2 are isomorphic.

3. For every root system R, there exists a semisimple Lie algebra g and a Cartan
subalgebra h of g such that the root system of g relative to h is isomorphic to R.


Point 1 of the theorem says that there is only one root system associated to each
semisimple Lie algebra g. Since all bases of a fixed root system are equivalent under 
W, it follows that there is only one Dynkin diagram associated to each g. Points 2 
and 3 then tell us that there is a one-to-one correspondence between isomorphism 
classes of semisimple Lie algebras and isomorphism classes of root systems. Thus, 
the classification of irreducible root systems also gives rise to a classification of 
simple Lie algebras. 

For Point 1 of the theorem see Section 16.4 of [Hum] and for Point 2 see Section
14.2 of [Hum]. For Point 3, one can proceed on a case-by-case basis, where the Lie
algebras A_n, B_n, C_n, and D_n have already been constructed as classical Lie algebras.
The exceptional Lie algebras can then be constructed by special methods, as in [Jac] 
or [Baez]. Alternatively, one can construct all of the simple Lie algebras by a unified 
method, as in Section 18 of [Hum]. 

In particular, the isomorphisms among root systems of small rank translate 
into isomorphisms among the associated semisimple Lie algebras. In rank one, 
for example, sl(2; C), so(3; C), and sp(1; C) are isomorphic. In rank two, the
isomorphism between B2 and C2 reflects an isomorphism of the Lie algebras
so(5; C) and sp(2; C). In rank three, the isomorphism between A3 and D3 reflects
an isomorphism of the Lie algebras sl(4; C) and so(6; C). 

By combining the classification of irreducible root systems in Theorem 8.49
with Proposition 8.32 and Theorem 8.50, we arrive at the following classification
of simple Lie algebras over the field of complex numbers.

Theorem 8.51. Every simple Lie algebra over C is isomorphic to precisely one
algebra from the following list:

1. sl(n + 1; C), n ≥ 1
2. so(2n + 1; C), n ≥ 2
3. sp(n; C), n ≥ 3
4. so(2n; C), n ≥ 4
5. The exceptional Lie algebras G2, F4, E6, E7, and E8


A semisimple Lie algebra is then determined up to isomorphism by specifying 
which simple summands occur and how many times each one occurs; see 
Proposition 7.9. It is also possible to classify simple Lie algebras over R. As 
we showed in Sect. 7.6, every such algebra is either a complex simple Lie algebra, 
viewed as a real Lie algebra, or a real form of a complex simple Lie algebra. Real 
forms of complex Lie algebras can then be enumerated using the Dynkin diagram 
of the complex Lie algebra as a starting point. See Section VI.10 and Appendix C 
of [Kna2] for a description of this enumeration. 


8.12 Exercises 


Unless otherwise noted, the notation in the exercises is as follows: (E, R) is a root
system with Weyl group W, Δ = {α_1, ..., α_r} is a fixed base for R, R^+ is the
associated set of positive roots, and C is the open fundamental chamber associated
to Δ.


1. (a) Suppose that α and β are linearly independent elements of R and that for
some positive integer k, the vector α + kβ belongs to R. Show that α + lβ
also belongs to R for all integers l with 0 < l < k.
Hint: If F = span(α, β), then R ∩ F is a rank-two root system in F.

(b) A collection of roots of the form α, α + β, ..., α + kβ for which neither
α − β nor α + (k + 1)β is a root is called a root string. What is the maximum
number of roots that can occur in a root string?

2. Let E be a finite-dimensional real inner product space and let V_1, ..., V_k be
subspaces of E of codimension at least one. Show that the union of the V_j's is
not all of E.

Hint: Show by induction on k that the complement of the union of the V_j's is a
nonempty open subset of E.

3. Let E be a finite-dimensional real inner product space, let {v_1, ..., v_r} be a basis
for E, and let {v_1^*, ..., v_r^*} be the dual basis, satisfying ⟨v_j^*, v_k⟩ = δ_jk for all j and
k. Let G and H be the Gram matrices for these two bases: G_jk = ⟨v_j, v_k⟩ and
H_jk = ⟨v_j^*, v_k^*⟩. Show that G and H are inverses of each other.

Hint: First show that for any u ∈ E, we have u = Σ_j ⟨v_j^*, u⟩ v_j. Then apply
this result to the vector u = v_k^*.

4. Show that Lemma 8.26 fails (even with u = e) if 6 is not assumed to be an
element of Ac.

5. Show that if R is an irreducible root system, then W acts irreducibly on E.

Hint: Suppose V ⊂ E is a W-invariant subspace. Show that every element of
R is either in V or in the orthogonal complement of V.

6. Let (E, R) be an irreducible root system and let ⟨·,·⟩ be the inner product on E.
Using Exercise 5, show that if ⟨·,·⟩' is a W-invariant inner product on E, there
is some constant c such that ⟨H, H'⟩' = c⟨H, H'⟩ for all H, H' ∈ E.

Hint: Consider the unique linear operator A : E → E that is symmetric with
respect to ⟨·,·⟩ and that satisfies

\[ \langle H, H' \rangle' = \langle H, AH' \rangle \]

for all H, H' ∈ E. Then imitate the proof of Schur's lemma, noting that the
eigenvalues of A are real.

7. Suppose (E, R) and (F, S) are irreducible root systems and that A : E → F is
an isomorphism of R with S. Show that A is a constant multiple of an isometry.

8. Using the outline below, prove the following result: For all μ, λ ∈ E, we have
μ ⪰ λ if and only if ⟨μ − λ, H⟩ ≥ 0 for all H ∈ C.

(a) Show that if μ ⪰ λ, then ⟨μ − λ, H⟩ ≥ 0 for all H ∈ C.
(b) Let {α_1^*, ..., α_r^*} be the dual basis to Δ, satisfying ⟨α_j^*, α_k⟩ = δ_jk for all j
and k. Show that if ⟨γ, H⟩ ≥ 0 for all H ∈ C, then ⟨γ, α_j^*⟩ ≥ 0 for all j.
(c) Show that if ⟨μ − λ, H⟩ ≥ 0 for all H ∈ C, then μ ⪰ λ.

9. Let P : E → R be the function given by

\[ P(H) = \prod_{\alpha \in R^+} \langle \alpha, H \rangle. \]

Show that P satisfies

\[ P(w \cdot H) = \det(w)\, P(H) \]

for all w ∈ W and all H ∈ E.

10. Show that if −I is not in the Weyl group of R, the Dynkin diagram of R must
have a nontrivial automorphism.

Hint: By Proposition 8.23, there exists an element w of W mapping −C to C.
Consider the map H ↦ −w · H, which maps C to itself.

11. For which rank-two root systems is −I an element of the Weyl group?

12. Show that the Weyl group of the A_n root system, described in Sect. 7.7.1, does
not contain −I, except when n = 1.

13. Let E = R^n and let R denote the collection of vectors of the following three
forms:

\[ \begin{aligned} &\pm e_j \pm e_k, && j < k, \\ &\pm e_j, && j = 1, \ldots, n, \\ &\pm 2e_j, && j = 1, \ldots, n. \end{aligned} \]

Show that R satisfies all the properties of a root system in Definition 8.1 except
Condition 2. The collection R is a "nonreduced root system" and is known as
BC_n, since it is the union of B_n and C_n. (Compare Figure 7.1 in the n = 2
case.)

14. Determine which of the Dynkin diagrams in Figures 8.22 and 8.23 have a 
nontrivial automorphism. Show that only the Dynkin diagram of D4 has an 
automorphism group with more than two elements. 


Chapter 9 
Representations of Semisimple Lie Algebras 


In this chapter, we prove the theorem of the highest weight for irreducible, finite- 
dimensional representations of a complex semisimple Lie algebra g. We first 
prove that every such representation has a highest weight, that two irreducible 
representations with the same highest weight are isomorphic, and that the highest 
weight of an irreducible representation must be dominant integral. This part of 
the theorem is established in precisely the same way as in the case of sl(3;C) 
in Chapter 6. It then remains to prove that every dominant integral element is, in 
fact, the highest weight of some irreducible representation. In the sl(3; C) case, 
we did this by first constructing the representations whose highest weights were 
the fundamental weights (1,0) and (0, 1), and then taking tensor products of these 
representations. For a general semisimple Lie algebra g, however, there is no simple 
way to construct the representations whose highest weights are the fundamental 
weights in Definition 8.36. Thus, we require a new method of constructing the 
irreducible representation of g with a given dominant integral highest weight. This 
construction will be the main topic of the present chapter. 

In Chapter 10, we will derive several additional properties of the irreducible 
representations, including the structure of the set of weights, the multiplicities of 
the weights, and the dimensions of the representations. In that chapter, we will 
also prove complete reducibility for representations of g, that is, that every finite- 
dimensional representation of g decomposes as a direct sum of irreducibles. 


9.1 Weights of Representations 


Throughout the chapter, we assume that $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ is a complex semisimple Lie algebra
and that $\mathfrak{h} = \mathfrak{t}_{\mathbb{C}}$ is a fixed Cartan subalgebra of $\mathfrak{g}$ (compare Proposition 7.11). We fix
on $\mathfrak{g}$ an inner product that is real on $\mathfrak{k}$ and that is invariant under the adjoint action
of $\mathfrak{k}$, as in Proposition 7.4. We let $R \subset i\mathfrak{t}$ denote the set of roots of $\mathfrak{g}$ relative to $\mathfrak{h}$,
we let $\Delta$ be a fixed base for R, and we let $R^+$ and $R^-$ be the sets of positive and
negative roots relative to $\Delta$, respectively. For each root $\alpha$, we consider the coroot
$H_\alpha \in \mathfrak{h}$ given by

$H_\alpha = 2\,\frac{\alpha}{\langle \alpha, \alpha\rangle}.$


We also consider the Weyl group W, that is, the group of linear transformations 
of h generated by the reflections about the hyperplanes orthogonal to the roots. 
Finally, we consider the notions of integral and dominant integral elements, as in 
Definition 8.34. 

We now introduce the notion of a weight of a representation, as in the sl(3; C)
case.


Definition 9.1. Let $(\pi, V)$ be a representation of $\mathfrak{g}$, possibly infinite dimensional.
An element $\lambda$ of $\mathfrak{h}$ is a weight of $\pi$ if there exists a nonzero vector $v \in V$ such that

$\pi(H)v = \langle \lambda, H\rangle v$ (9.1)

for all $H \in \mathfrak{h}$. The weight space corresponding to $\lambda$ is the set of all $v \in V$
satisfying (9.1), and the multiplicity of $\lambda$ is the dimension of the corresponding
weight space.


Throughout the chapter, we will use, without comment, Proposition A.17, which 
says that weight vectors with distinct weights are linearly independent. 


Proposition 9.2. If $(\pi, V)$ is a finite-dimensional representation of $\mathfrak{g}$, every weight
of $\pi$ is an integral element.


Proof. For each root $\alpha$, let $\mathfrak{s}^\alpha = \langle X_\alpha, Y_\alpha, H_\alpha\rangle \cong$ sl(2; C) be the subalgebra of $\mathfrak{g}$ in
Theorem 7.19. If v is a weight vector with weight $\lambda$, then

$\pi(H_\alpha)v = \langle \lambda, H_\alpha\rangle v.$

Thus, by applying Point 1 of Theorem 4.34 to the restriction of $\pi$ to $\mathfrak{s}^\alpha$, we see that
$\langle \lambda, H_\alpha\rangle$ must be an integer, showing that $\lambda$ is integral. □


Theorem 9.3. If $(\pi, V)$ is a finite-dimensional representation of $\mathfrak{g}$, the weights of
$\pi$ and their multiplicities are invariant under the action of W on $\mathfrak{h}$.


Proof. For each $\alpha \in R$, we may construct the operator

$S_\alpha = e^{\pi(X_\alpha)} e^{-\pi(Y_\alpha)} e^{\pi(X_\alpha)}.$

If $\langle \alpha, H\rangle = 0$, then H will commute with both $X_\alpha$ and $Y_\alpha$, and thus $\pi(H)$ will
commute with $S_\alpha$. On the other hand, by Point 3 of Theorem 4.34, we have
$S_\alpha \pi(H_\alpha) S_\alpha^{-1} = -\pi(H_\alpha)$. We see, then, that

$S_\alpha \pi(H) S_\alpha^{-1} = \pi(s_\alpha \cdot H)$

for all $H \in \mathfrak{h}$.

Suppose now that v is a weight vector with some weight $\lambda$. Then

$\pi(H) S_\alpha^{-1} v = S_\alpha^{-1} \pi(s_\alpha \cdot H) v$
$= \langle \lambda, s_\alpha \cdot H\rangle S_\alpha^{-1} v$
$= \langle s_\alpha^{-1} \cdot \lambda, H\rangle S_\alpha^{-1} v,$

showing that $S_\alpha^{-1} v$ is a weight vector with weight $s_\alpha^{-1} \cdot \lambda$. Thus, $S_\alpha^{-1}$ maps the
weight space with weight $\lambda$ into the weight space with weight $s_\alpha^{-1} \cdot \lambda$. Meanwhile,
essentially the same argument shows that $S_\alpha$ maps the weight space with weight
$s_\alpha^{-1} \cdot \lambda$ into the weight space with weight $\lambda$, showing that the two spaces are isomorphic.
Thus, $s_\alpha^{-1} \cdot \lambda$ is again a weight with the same multiplicity as $\lambda$. Thus, the weights
and multiplicities are invariant under each $s_\alpha^{-1} = s_\alpha$ and, thus, under W. □
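The conjugation identity $S_\alpha \pi(H_\alpha) S_\alpha^{-1} = -\pi(H_\alpha)$ used in this proof can be checked with exact arithmetic in small cases. The sketch below is only an illustration, not part of the argument. It assumes the standard matrices of the (m + 1)-dimensional irreducible representation of sl(2; C), following the formulas of Sect. 4.6 as quoted later in this chapter. The helper names (sl2_rep, exp_nilpotent, check_weyl_conjugation) are hypothetical.

```python
from fractions import Fraction
from math import factorial

def sl2_rep(m):
    """Matrices of X, Y, H in the basis v_0..v_m, where Y v_j = v_{j+1},
    H v_j = (m - 2j) v_j and X v_j = j (m - j + 1) v_{j-1}.
    Column j of each matrix holds the image of v_j."""
    n = m + 1
    X = [[Fraction(0)] * n for _ in range(n)]
    Y = [[Fraction(0)] * n for _ in range(n)]
    H = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        H[j][j] = Fraction(m - 2 * j)
        if j + 1 < n:
            Y[j + 1][j] = Fraction(1)
        if j >= 1:
            X[j - 1][j] = Fraction(j * (m - j + 1))
    return X, Y, H

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def negate(A):
    return [[-a for a in row] for row in A]

def exp_nilpotent(A):
    """Exponential of a strictly triangular n x n matrix (finite sum, A^n = 0)."""
    n = len(A)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, n):
        power = matmul(power, A)
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j] / factorial(k)
    return result

def check_weyl_conjugation(m):
    """Return True if S H = -H S, i.e. S H S^{-1} = -H, for S = e^X e^{-Y} e^X."""
    X, Y, H = sl2_rep(m)
    S = matmul(matmul(exp_nilpotent(X), exp_nilpotent(negate(Y))), exp_nilpotent(X))
    return matmul(S, H) == negate(matmul(H, S))
```

For example, check_weyl_conjugation(3) tests the identity in the four-dimensional representation. Rational arithmetic is used so that the comparison is exact rather than subject to floating-point error.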


We now state the “easy” part of the theorem of the highest weight for represen- 
tations of g. 


Theorem 9.4. 1. Every irreducible, finite-dimensional representation of $\mathfrak{g}$ has a
highest weight.

2. Two irreducible, finite-dimensional representations of $\mathfrak{g}$ with the same highest
weight are isomorphic.

3. If $(\pi, V)$ is an irreducible, finite-dimensional representation of $\mathfrak{g}$ with highest
weight $\mu$, then $\mu$ is dominant integral.


Proof. Enumerate the positive roots as $\alpha_1, \ldots, \alpha_N$. Choose a basis for $\mathfrak{g}$ consisting
of elements $X_1, \ldots, X_N$ with $X_j \in \mathfrak{g}_{\alpha_j}$, elements $Y_1, \ldots, Y_N$ with $Y_j \in \mathfrak{g}_{-\alpha_j}$, and
a basis $H_1, \ldots, H_r$ for $\mathfrak{h}$. Then the proof of Proposition 6.11 from Chapter 6 carries
over to the present setting, with only the obvious notational changes, showing that
every irreducible, finite-dimensional representation of $\mathfrak{g}$ has a highest weight.

The proofs of Propositions 6.14 and 6.15 then also go through without change to
show that two irreducible, finite-dimensional representations with the same highest
weight are isomorphic. Finally, using the sl(2; C)-subalgebras in Theorem 7.19, we
may follow the proof of Proposition 6.16 to show that the highest weight of a
finite-dimensional, irreducible representation must be dominant integral. □


We now come to the “hard” part of the theorem of the highest weight. 


Theorem 9.5. If $\mu$ is a dominant integral element, there exists an irreducible,
finite-dimensional representation of $\mathfrak{g}$ with highest weight $\mu$.


As we have noted, the method of proof of Proposition 6.17, from the sl(3; C)
case, does not readily extend to general semisimple Lie algebras. The proof of
Theorem 9.5 will occupy the remainder of this chapter.


9.2 Introduction to Verma Modules


Our goal is to construct, for each dominant integral element $\mu \in \mathfrak{h}$, a
finite-dimensional, irreducible representation of $\mathfrak{g}$ with highest weight $\mu$. Our construction
will proceed in two stages. The first stage consists of constructing an
infinite-dimensional representation $V_\mu$ of $\mathfrak{g}$, known as a Verma module. This representation
will not be irreducible, but will be a highest weight cyclic representation with
highest weight $\mu$. We will construct $V_\mu$ as a quotient of the so-called universal
enveloping algebra $U(\mathfrak{g})$ of $\mathfrak{g}$. In order to show that the highest weight vector
in $V_\mu$ is nonzero, we will need to develop a structure result for $U(\mathfrak{g})$ known as
the Poincaré–Birkhoff–Witt theorem (Theorem 9.9). Unlike the weights of the
finite-dimensional representations of $\mathfrak{g}$, the weights of the Verma module are not
invariant under the action of the Weyl group.

The second stage in our construction consists of showing that when $\mu$ is
dominant integral, $V_\mu$ has an invariant subspace $W_\mu$ for which the quotient space
$V_\mu / W_\mu$ is finite dimensional and irreducible, but not zero. To establish the finite
dimensionality of the quotient, we will show that when $\mu$ is dominant integral,
the weights of $V_\mu / W_\mu$, unlike those of $V_\mu$, are invariant under the action of the
Weyl group. Thus, each weight $\lambda$ of $V_\mu / W_\mu$ is integral and satisfies $w \cdot \lambda \preceq \mu$ for
all w in the Weyl group. It turns out that there are only finitely many $\lambda$'s with this
property. Since each weight $\lambda$ has finite multiplicity (even in $V_\mu$), it follows that
$V_\mu / W_\mu$ is finite dimensional.


Definition 9.6. A (possibly infinite-dimensional) representation $(\pi, V)$ of $\mathfrak{g}$ is
highest weight cyclic with highest weight $\mu \in \mathfrak{h}$ if there exists a nonzero vector
$v \in V$ such that (1) $\pi(H)v = \langle \mu, H\rangle v$ for all $H \in \mathfrak{h}$, (2) $\pi(X)v = 0$ for all
$X \in \mathfrak{g}_\alpha$ with $\alpha \in R^+$, (3) the smallest invariant subspace containing v is V.


Note that $\mu \in \mathfrak{h}$ is not required to be integral. Although it will turn out that
all finite-dimensional highest weight cyclic representations are irreducible, this
is not the case in infinite dimensions. Furthermore, two highest weight cyclic
representations with the same highest weight may not be isomorphic, unless both
of them are finite dimensional. In this chapter, we will construct, for any $\mu \in \mathfrak{h}$,
a particular highest weight cyclic representation $V_\mu$ with highest weight $\mu$, known
as a Verma module. The Verma module is the "maximal" highest weight cyclic
representation with a particular highest weight, and it is always infinite dimensional,
even when $\mu$ is dominant integral.

In the sl(2; C) case, Verma modules can be constructed explicitly as follows.
For any complex number $\mu$, construct an infinite-dimensional vector space $V_\mu$ with
basis $v_0, v_1, \ldots$. (The elements of $V_\mu$ are finite linear combinations of the $v_j$'s.) We
define an action of sl(2; C) on $V_\mu$ by the same formulas as in Sect. 4.6:

$\pi_\mu(Y)v_j = v_{j+1}$
$\pi_\mu(H)v_j = (\mu - 2j)v_j$
$\pi_\mu(X)v_0 = 0$
$\pi_\mu(X)v_j = j(\mu - (j - 1))v_{j-1}.$

Note that in Sect. 4.6, the vectors $v_j$ equaled zero for large j, whereas here the $v_j$'s
are, by definition, linearly independent. Direct calculation shows that these formulas
do, in fact, define a representation of sl(2; C).
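The "direct calculation" can be mechanized. The following sketch checks the relations [X, Y] = H, [H, X] = 2X, and [H, Y] = -2Y on the basis vectors $v_0, \ldots, v_{\text{max\_j}}$. It makes two assumptions for illustration: $\mu$ is taken to be rational (so that the arithmetic is exact), and vectors are stored as dictionaries mapping the index j to the coefficient of $v_j$. The function names are hypothetical.

```python
from fractions import Fraction

def act(op, vec, mu):
    """Apply pi_mu(op), op in {"X", "Y", "H"}, to a finite linear combination
    of basis vectors v_j, stored as a dict {j: coefficient}."""
    out = {}
    for j, c in vec.items():
        if op == "Y":                      # pi(Y) v_j = v_{j+1}
            target, coeff = j + 1, c
        elif op == "H":                    # pi(H) v_j = (mu - 2j) v_j
            target, coeff = j, c * (mu - 2 * j)
        elif j == 0:                       # pi(X) v_0 = 0
            continue
        else:                              # pi(X) v_j = j (mu - (j-1)) v_{j-1}
            target, coeff = j - 1, c * j * (mu - (j - 1))
        out[target] = out.get(target, 0) + coeff
    return {j: c for j, c in out.items() if c != 0}

def combine(u, v, s):
    """Return u + s*v, dropping zero coefficients."""
    out = dict(u)
    for j, c in v.items():
        out[j] = out.get(j, 0) + s * c
    return {j: c for j, c in out.items() if c != 0}

def commutator(a, b, vec, mu):
    """Compute (pi(a) pi(b) - pi(b) pi(a)) applied to vec."""
    return combine(act(a, act(b, vec, mu), mu), act(b, act(a, vec, mu), mu), -1)

def relations_hold(mu, max_j):
    """Check the sl(2; C) commutation relations on v_0, ..., v_{max_j}."""
    for j in range(max_j + 1):
        v = {j: Fraction(1)}
        if commutator("X", "Y", v, mu) != act("H", v, mu):
            return False
        if commutator("H", "X", v, mu) != combine({}, act("X", v, mu), 2):
            return False
        if commutator("H", "Y", v, mu) != combine({}, act("Y", v, mu), -2):
            return False
    return True
```

A call such as relations_hold(Fraction(7, 3), 25) exercises a non-integral value of $\mu$. A finite check of this kind is a sanity test, not a substitute for the general calculation.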

When $\mu$ is a non-negative integer m, the space $W_\mu$ spanned by $v_{m+1}, v_{m+2}, \ldots$ is
invariant under the action of sl(2; C). After all, this space is clearly invariant under
the action of $\pi_\mu(Y)$ and $\pi_\mu(H)$, and it is invariant under $\pi_\mu(X)$ because (with
$\mu = m$) we have

$\pi_\mu(X)v_{m+1} = (m + 1)(m - m)v_m = 0.$

Since $W_\mu$ is invariant, the quotient vector space $V_\mu / W_\mu$ inherits a natural action
of sl(2; C). This quotient space is then the unique finite-dimensional irreducible
representation of sl(2; C) with highest weight $\mu$.
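The role of integrality can be seen by locating the first index j at which the coefficient in $\pi_\mu(X)v_j = j(\mu - (j - 1))v_{j-1}$ vanishes. The sketch below is a small illustration under the same formulas. The function names and the finite search limit are assumptions made for the example.

```python
from fractions import Fraction

def x_coefficient(mu, j):
    """Coefficient c in pi_mu(X) v_j = c v_{j-1}; zero for j = 0."""
    return j * (mu - (j - 1)) if j > 0 else 0

def first_killed_index(mu, search_limit=200):
    """Smallest j >= 1 with pi_mu(X) v_j = 0, searched up to search_limit,
    or None if no such j exists in that range."""
    for j in range(1, search_limit + 1):
        if x_coefficient(mu, j) == 0:
            return j
    return None

def quotient_weights(m):
    """Eigenvalues of pi(H) on the images of v_0, ..., v_m in V_m / W_m."""
    return [m - 2 * j for j in range(m + 1)]
```

When $\mu = m$ is a non-negative integer, the first vanishing index is m + 1, which is why the span of $v_{m+1}, v_{m+2}, \ldots$ is invariant. When $\mu$ is not a non-negative integer, the coefficient never vanishes for $j \geq 1$, so there is no such subspace. The weights returned by quotient_weights are symmetric about zero, as Theorem 9.3 requires.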

In the case of a general semisimple Lie algebra $\mathfrak{g}$, we would like to do something
similar. Pick a basis consisting of vectors

$Y_1, \ldots, Y_N, H_1, \ldots, H_r, X_1, \ldots, X_N,$ (9.2)

as in the proof of Theorem 9.4. If $V_\mu$ is any highest weight cyclic representation
with highest weight $\mu$ and highest weight vector $v_0$, then $V_\mu$ is spanned by products
of the basis elements applied to $v_0$. By the reordering lemma (Lemma 6.12), we can
reorder any such product as a linear combination of terms in which the elements are
in the order listed in (9.2). Once the basis elements are in this order, any term that
contains any $X_j$'s will give zero when applied to $v_0$. Furthermore, in any term that
does not have any $X_j$'s, any factors of $H_j$ will simply give $\langle \mu, H_j\rangle$ when applied
to $v_0$. Thus, $V_\mu$ must be spanned by elements of the form

$\pi_\mu(Y_1)^{n_1} \pi_\mu(Y_2)^{n_2} \cdots \pi_\mu(Y_N)^{n_N} v_0.$ (9.3)


The idea of a Verma module is that we should proceed as in the sl(2; C) case
and simply decree that the vectors in (9.3) form a basis for our Verma module. The
weights of the Verma module will then consist of all elements of the form

$\mu - n_1\alpha_1 - \cdots - n_N\alpha_N,$

where each $n_j$ is a non-negative integer. (See Figure 9.1.) If we do this, then there
is only one possible way that the Lie algebra $\mathfrak{g}$ can act. After all, if we apply some
Lie algebra element $\pi_\mu(Z)$ to a vector as in (9.3), we can reorder the elements as in
the previous paragraph until they are in the order of (9.2). Then, as we have already
noted, any factors of $\pi_\mu(X_j)$ give zero and any factors of $\pi_\mu(H_j)$ give constants.
We will, thus, eventually get back a linear combination of elements of the form (9.3).

Fig. 9.1 The weights of the Verma module with highest weight $\mu$


The difficulty with the above description of the Verma module is that it does
not provide any reasonable method for checking that $\pi_\mu$ actually constitutes a
representation of $\mathfrak{g}$. After all, unless $\mathfrak{g}$ = sl(2; C), it is impossible to write
down an explicit description of how the various basis elements act, and it is thus
impossible to verify directly that these elements satisfy the correct commutation
relations. Nevertheless, we will eventually prove (Sect. 9.5) that there is a
well-defined representation of $\mathfrak{g}$ having the elements (9.3) as a basis and in which $\mathfrak{g}$ acts
in the way described above. In the case in which $\mu$ is dominant and integral, we will
then construct an invariant subspace of the Verma module for which the quotient is
finite dimensional and irreducible.
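To make the description of the weights concrete, consider sl(3; C), whose positive roots are $\alpha_1$, $\alpha_2$, and $\alpha_3 = \alpha_1 + \alpha_2$. Assuming, as is proved later in the chapter, that the vectors in (9.3) form a basis, the multiplicity of the weight $\mu - a\alpha_1 - b\alpha_2$ is the number of exponent triples $(n_1, n_2, n_3)$ with $n_1 + n_3 = a$ and $n_2 + n_3 = b$. The following sketch counts these triples by brute force. The function name and the truncation parameter are illustrative assumptions.

```python
from itertools import product
from collections import Counter

def verma_multiplicities_sl3(max_exponent):
    """Count exponent triples (n1, n2, n3), each at most max_exponent, by the
    pair (a, b) for which n1*alpha1 + n2*alpha2 + n3*(alpha1 + alpha2)
    equals a*alpha1 + b*alpha2.  The counts are complete for all pairs
    with a <= max_exponent and b <= max_exponent."""
    counts = Counter()
    for n1, n2, n3 in product(range(max_exponent + 1), repeat=3):
        counts[(n1 + n3, n2 + n3)] += 1
    return counts
```

In the range where the count is complete, the multiplicity comes out as min(a, b) + 1. In particular, the multiplicities grow without bound as one moves away from $\mu$, in contrast to the sl(2; C) case, where every weight of the Verma module has multiplicity one.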


9.3 Universal Enveloping Algebras 


In Sect. 9.5, we will construct each Verma module $V_\mu$ as a quotient of something
called the universal enveloping algebra of a Lie algebra $\mathfrak{g}$. If $\mathfrak{g}$ is a Lie algebra, we
may try to embed $\mathfrak{g}$ as a subspace of some associative algebra A in such a way that
the bracket on $\mathfrak{g}$ may be computed as [X, Y] = XY - YX, where XY and YX are
computed in A. If $\mathfrak{g}$ is the Lie algebra of a matrix Lie group $G \subset$ GL(n; C), then $\mathfrak{g}$
is a subspace of the associative algebra $M_n(\mathbb{C})$ and the bracket on $\mathfrak{g}$ is indeed given
as [X, Y] = XY - YX. There may be, however, many other ways to embed $\mathfrak{g}$ into
an associative algebra A. For example, if $\mathfrak{g}$ = sl(2; C), then for each $m \geq 1$, the
irreducible representation $\pi_m$ of $\mathfrak{g}$ of dimension m + 1 allows us to embed $\mathfrak{g}$ into
$M_{m+1}(\mathbb{C})$.

Let us now give a useful but slightly imprecise definition of the universal
enveloping algebra of $\mathfrak{g}$, denoted $U(\mathfrak{g})$. For any Lie algebra $\mathfrak{g}$, the universal
enveloping algebra of $\mathfrak{g}$ will be an associative algebra A with identity with the
following properties. (1) The Lie algebra $\mathfrak{g}$ embeds into A in such a way that
[X, Y] = XY - YX. (2) The algebra A is generated by elements of $\mathfrak{g}$, meaning
that the smallest subalgebra with identity of A containing $\mathfrak{g}$ is all of A. (3) The
algebra A is maximal among all algebras with the two previous properties.
The maximality property of A will be explained more precisely in the discussion
following Theorem 9.7.

Consider, for example, the case of a one-dimensional Lie algebra $\mathfrak{g}$ spanned by
a single nonzero element X, which of course satisfies [X, X] = 0. Then $U(\mathfrak{g})$
should be an associative algebra with identity generated by a single element X, in
which case, $U(\mathfrak{g})$ must also be commutative. Now, any associative algebra A with
identity generated by a single nonzero element X will satisfy Properties 1 and 2
in the definition of the enveloping algebra. But for A to be maximal, there should
be no relations between the different powers of X, meaning that p(X) should be a
nonzero element of A for every nonzero polynomial p. In this case, then, we may
take $U(\mathfrak{g})$ to be the algebra of polynomials in a single variable.

Suppose the algebra $\mathfrak{g}$ in the previous paragraph is a matrix algebra, meaning
that X is a single $n \times n$ matrix. Contrary to what we might, at first, expect, the
enveloping algebra $U(\mathfrak{g})$ will not coincide with the associative algebra with identity
A generated by X inside $M_n(\mathbb{C})$. After all, for any $X \in M_n(\mathbb{C})$, the
Cayley–Hamilton theorem implies that there exists a nonzero polynomial p (namely, the
characteristic polynomial of X) for which p(X) = 0. In $U(\mathfrak{g})$, by contrast, we have
said that p(X) should be nonzero for all nonzero polynomials p.
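The Cayley–Hamilton relation is easy to observe in the smallest case. The sketch below, an illustration with hypothetical function names, evaluates the characteristic polynomial $p(t) = t^2 - \operatorname{tr}(X)\,t + \det(X)$ of a $2 \times 2$ matrix X at X itself.

```python
def matmul2(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def char_poly_at(X):
    """Evaluate p(t) = t^2 - tr(X) t + det(X) at the 2 x 2 matrix X."""
    (a, b), (c, d) = X
    trace = a + d
    det = a * d - b * c
    X2 = matmul2(X, X)
    return [[X2[i][j] - trace * X[i][j] + det * int(i == j) for j in range(2)]
            for i in range(2)]
```

The result is always the zero matrix, so the elements 1, X, $X^2$ are linearly dependent inside $M_2(\mathbb{C})$, whereas the corresponding powers of X in $U(\mathfrak{g})$ are required to be linearly independent.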

Actually, it follows from the PBW theorem (Theorem 9.9) that the universal
enveloping algebra of any nonzero Lie algebra $\mathfrak{g}$ is infinite dimensional. In fact, if X
is any nonzero element of $\mathfrak{g}$, the elements $1, X, X^2, \ldots$ will be linearly independent
in $U(\mathfrak{g})$. Thus, even if $\mathfrak{g}$ is an algebra of matrices, $U(\mathfrak{g})$ cannot be isomorphic to a
subalgebra of an algebra of matrices.

We now give the formal definition of a universal enveloping algebra. 


Theorem 9.7. For any Lie algebra $\mathfrak{g}$, there exists an associative algebra with
identity, denoted $U(\mathfrak{g})$, together with a linear map $i : \mathfrak{g} \to U(\mathfrak{g})$ such that the
following properties hold. (1) For all $X, Y \in \mathfrak{g}$, we have

$i([X, Y]) = i(X)i(Y) - i(Y)i(X).$ (9.4)

(2) The algebra $U(\mathfrak{g})$ is generated by elements of the form i(X), $X \in \mathfrak{g}$, meaning
that the smallest subalgebra with identity of $U(\mathfrak{g})$ containing every i(X) is $U(\mathfrak{g})$. (3)
Suppose A is an associative algebra with identity and $j : \mathfrak{g} \to A$ is a linear map
such that j([X, Y]) coincides with j(X)j(Y) - j(Y)j(X) for all $X, Y \in \mathfrak{g}$. Then
there exists a unique algebra homomorphism $\phi : U(\mathfrak{g}) \to A$ such that $\phi(1) = 1$
and such that $\phi(i(X)) = j(X)$ for all $X \in \mathfrak{g}$.

A pair $(U(\mathfrak{g}), i)$ with the preceding properties is called a universal enveloping
algebra for $\mathfrak{g}$.

A simple argument (Exercise 1) shows that any two universal enveloping algebras
for a fixed Lie algebra $\mathfrak{g}$ are "canonically" isomorphic.

Let us define an enveloping algebra of $\mathfrak{g}$ to be an associative algebra A together
with a linear map $j : \mathfrak{g} \to A$ as in Point 3 of the theorem, with the additional
property that A is generated by elements of the form j(X), $X \in \mathfrak{g}$. In this case, the
homomorphism $\phi : U(\mathfrak{g}) \to A$ as in Theorem 9.7 is surjective and A is isomorphic
to the quotient algebra $U(\mathfrak{g})/\ker(\phi)$. Thus, the universal enveloping algebra $U(\mathfrak{g})$
of $\mathfrak{g}$ has the property that every other enveloping algebra of $\mathfrak{g}$ is a quotient of $U(\mathfrak{g})$.
This property of $U(\mathfrak{g})$ is a more precise formulation of the maximality condition we
discussed in the second paragraph of this subsection.

The construction of $U(\mathfrak{g})$ is in some sense easy or "soft." But for $U(\mathfrak{g})$ to be
useful in practice (for example, in constructing Verma modules), we need a structure
theorem for it known as the Poincaré–Birkhoff–Witt theorem. The
Poincaré–Birkhoff–Witt theorem will, in particular, show that the map i in Theorem 9.7 is
actually injective. (See also Exercise 2.) Once this is established, we will be able to
identify $\mathfrak{g}$ with its image under i and thus think of $\mathfrak{g}$ as embedded into $U(\mathfrak{g})$.


Proof of Theorem 9.7. The operation of tensor product on vector spaces is
associative, in the sense that $U \otimes (V \otimes W)$ is canonically isomorphic to $(U \otimes V) \otimes W$,
with the isomorphism taking $u \otimes (v \otimes w)$ to $(u \otimes v) \otimes w$ for each $u \in U$, $v \in V$,
and $w \in W$. We may thus drop the parentheses and write simply $U \otimes V \otimes W$ and
$u \otimes v \otimes w$, and similarly for the tensor product of any finite number of vector spaces.
In particular, we will let $V^{\otimes k}$ denote the k-fold tensor product $V \otimes \cdots \otimes V$. The
0-fold tensor product $V^{\otimes 0}$ is defined to be C (or whatever field we are working
over).

For a Lie algebra $\mathfrak{g}$, let us first define the tensor algebra $T(\mathfrak{g})$ over $\mathfrak{g}$, which is
defined as

$T(\mathfrak{g}) = \bigoplus_{k=0}^{\infty} \mathfrak{g}^{\otimes k}.$

In the direct sum, each element of $T(\mathfrak{g})$ is required to be a finite linear combination
of elements of $\mathfrak{g}^{\otimes k}$ for different values of k. We can make $T(\mathfrak{g})$ into an associative
algebra with identity by defining

$(u_1 \otimes u_2 \otimes \cdots \otimes u_k) \cdot (v_1 \otimes v_2 \otimes \cdots \otimes v_l)$
$= u_1 \otimes u_2 \otimes \cdots \otimes u_k \otimes v_1 \otimes v_2 \otimes \cdots \otimes v_l$ (9.5)

and then extending the product by linearity. That is to say, the product operation
is the unique bilinear map of $T(\mathfrak{g}) \times T(\mathfrak{g})$ into $T(\mathfrak{g})$ that coincides with the
tensor product (9.5) on $\mathfrak{g}^{\otimes k} \times \mathfrak{g}^{\otimes l}$. Since $\mathbb{C} \otimes \mathfrak{g}^{\otimes k}$ is naturally isomorphic to $\mathfrak{g}^{\otimes k}$,
the identity element $1 \in \mathbb{C} = \mathfrak{g}^{\otimes 0}$ is the multiplicative identity for $T(\mathfrak{g})$. The
associativity of the tensor product assures that $T(\mathfrak{g})$ is an associative algebra.

We now claim that the algebra $T(\mathfrak{g})$ has the following property: If A is any
associative algebra with identity and $j : \mathfrak{g} \to A$ is any linear map, there exists an
algebra homomorphism $\phi : T(\mathfrak{g}) \to A$ such that $\phi(1) = 1$ and $\phi(X) = j(X)$
for all $X \in \mathfrak{g} \subset T(\mathfrak{g})$. Note that this property differs from the desired property of
$U(\mathfrak{g})$ in that j is an arbitrary linear map and does not have to have any particular
relationship to the algebra structure of A. To construct $\phi$, we require the
restriction of $\phi$ to $\mathfrak{g}^{\otimes k}$ to be the unique linear map of $\mathfrak{g}^{\otimes k}$ into A such that

$\phi(X_1 \otimes \cdots \otimes X_k) = j(X_1) j(X_2) \cdots j(X_k)$ (9.6)

for all $X_1, \ldots, X_k$ in $\mathfrak{g}$. (Here we are using the natural k-fold extension of the
universal property of tensor products in Definition 4.13.) It is then simple to
check that $\phi$ is an algebra homomorphism. Furthermore, if $\phi$ is to be an algebra
homomorphism that agrees with j on $\mathfrak{g}$, then $\phi$ must have the form in (9.6).
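The product (9.5) and the extension (9.6) can be modeled directly once a basis of $\mathfrak{g}$ is chosen. In the sketch below, which is an illustration rather than part of the proof, a basis tensor $X_{a_1} \otimes \cdots \otimes X_{a_k}$ is stored as the tuple $(a_1, \ldots, a_k)$, an element of $T(\mathfrak{g})$ is a dictionary from such tuples to coefficients, and the target algebra A is assumed to be an algebra of $n \times n$ matrices. All names are hypothetical.

```python
def tensor_mul(s, t):
    """Product in T(g): concatenate basis words and extend bilinearly.
    The empty tuple () represents the identity 1 in the 0-fold tensor power."""
    out = {}
    for w1, c1 in s.items():
        for w2, c2 in t.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def phi(element, j_images, n):
    """Extension of a linear map j to T(g), as in (9.6): the word (a1, ..., ak)
    is sent to j(X_a1) ... j(X_ak), where j_images[a] is the matrix j(X_a)."""
    total = [[0] * n for _ in range(n)]
    for word, c in element.items():
        prod = [[int(r == s) for s in range(n)] for r in range(n)]
        for a in word:
            prod = mat_mul(prod, j_images[a])
        for r in range(n):
            for s in range(n):
                total[r][s] += c * prod[r][s]
    return total
```

Because j is an arbitrary linear map, the matrices in j_images may be chosen freely. Whatever the choice, phi(tensor_mul(s, t), ...) agrees with the matrix product of phi(s, ...) and phi(t, ...), which is the homomorphism property claimed in the text.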

We now proceed to construct $U(\mathfrak{g})$ as a quotient of $T(\mathfrak{g})$. A two-sided ideal in
$T(\mathfrak{g})$ is a subspace J of $T(\mathfrak{g})$ such that for all $\alpha \in T(\mathfrak{g})$ and $\beta \in J$, the elements $\alpha\beta$
and $\beta\alpha$ belong to J. We now let J be the smallest two-sided ideal in $T(\mathfrak{g})$ containing
all elements of the form

$X \otimes Y - Y \otimes X - [X, Y], \quad X, Y \in \mathfrak{g}.$ (9.7)

That is to say, J is the intersection of all two-sided ideals in $T(\mathfrak{g})$ containing all
such elements, which is, again, a two-sided ideal containing these elements. More
concretely, J can be constructed as the space of elements of the form

$\sum_{j=1}^{N} \alpha_j (X_j \otimes Y_j - Y_j \otimes X_j - [X_j, Y_j]) \beta_j$

with $X_j$ and $Y_j$ in $\mathfrak{g}$ and $\alpha_j$ and $\beta_j$ being arbitrary elements of $T(\mathfrak{g})$.

We now form the quotient vector space $T(\mathfrak{g})/J$, which is an algebra. If
$j : \mathfrak{g} \to A$ is any linear map, we can form the algebra homomorphism
$\phi : T(\mathfrak{g}) \to A$ as above. If j satisfies j([X, Y]) = j(X)j(Y) - j(Y)j(X), then
the kernel of $\phi$ will contain all elements of the form $X \otimes Y - Y \otimes X - [X, Y]$.
Furthermore, the kernel of an algebra homomorphism is always a two-sided ideal.
Thus, $\ker(\phi)$ contains J. It follows that the map $\phi : T(\mathfrak{g}) \to A$ factors through
$U(\mathfrak{g}) := T(\mathfrak{g})/J$, giving the desired homomorphism of $U(\mathfrak{g})$ into A. Since $U(\mathfrak{g})$
is spanned by products of elements of $\mathfrak{g}$, there can be at most one map $\phi$ with the
desired property, establishing the claimed uniqueness of $\phi$. □


Proposition 9.8. If $\pi : \mathfrak{g} \to \mathrm{End}(V)$ is a representation of a Lie algebra $\mathfrak{g}$
(not necessarily finite dimensional), there is a unique algebra homomorphism
$\tilde{\pi} : U(\mathfrak{g}) \to \mathrm{End}(V)$ such that $\tilde{\pi}(1) = I$ and $\tilde{\pi}(X) = \pi(X)$ for all $X \in \mathfrak{g} \subset U(\mathfrak{g})$.

Proof. Apply Theorem 9.7 with $A = \mathrm{End}(V)$ and $j(X) = \pi(X)$. □

We now state the Poincaré–Birkhoff–Witt theorem (or PBW theorem, for
short), which is the key structure result for universal enveloping algebras. Although
the result holds even for infinite-dimensional Lie algebras, we state it here in the
finite-dimensional case, for notational simplicity. The proof of the PBW theorem is
in Sect. 9.4.


Theorem 9.9 (PBW Theorem). If $\mathfrak{g}$ is a finite-dimensional Lie algebra with basis
$X_1, \ldots, X_k$, then elements of the form

$i(X_1)^{n_1} i(X_2)^{n_2} \cdots i(X_k)^{n_k},$ (9.8)

where each $n_j$ is a non-negative integer, span $U(\mathfrak{g})$ and are linearly independent.
In particular, the elements $i(X_1), \ldots, i(X_k)$ are linearly independent, meaning that
the map $i : \mathfrak{g} \to U(\mathfrak{g})$ is injective.


In (9.8), we interpret $i(X_j)^{n_j}$ as 1 if $n_j = 0$. Since, actually, i is injective, we
will henceforth identify $\mathfrak{g}$ with its image under i and thus regard $\mathfrak{g}$ as a subspace of
$U(\mathfrak{g})$. Thus, we may now write X in place of i(X). In our new notation, we may
write (9.4) as

$[X, Y] = XY - YX$

and we may write the basis elements (9.8) as

$X_1^{n_1} X_2^{n_2} \cdots X_k^{n_k}.$ (9.9)

It is straightforward to show that the elements in (9.8) span $U(\mathfrak{g})$; the hard part is to
prove they are linearly independent.


Corollary 9.10. If $\mathfrak{g}$ is a Lie algebra and $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$, then there is a
natural injection of $U(\mathfrak{h})$ into $U(\mathfrak{g})$ given by mapping any product $X_1 X_2 \cdots X_N$ of
elements of $\mathfrak{h}$ to the same product in $U(\mathfrak{g})$.


Proof. The inclusion of $\mathfrak{h}$ into $\mathfrak{g}$ induces an algebra homomorphism $\phi : U(\mathfrak{h}) \to
U(\mathfrak{g})$. Let us choose a basis $X_1, \ldots, X_k$ for $\mathfrak{h}$ and extend it to a basis $X_1, \ldots, X_N$
for $\mathfrak{g}$. By the PBW theorem for $\mathfrak{h}$, the elements $X_1^{n_1} \cdots X_k^{n_k}$ form a basis for $U(\mathfrak{h})$.
Then by the PBW theorem for $\mathfrak{g}$, the corresponding elements of $U(\mathfrak{g})$ are linearly
independent, showing that $\phi$ is injective. □


9.4 Proof of the PBW Theorem 


It is notationally convenient to write the elements of the claimed basis for $U(\mathfrak{g})$ as

$i(X_{j_1}) i(X_{j_2}) \cdots i(X_{j_N}),$ (9.10)

with $j_1 \leq j_2 \leq \cdots \leq j_N$, where we interpret the above expression as 1 if N = 0.
The easy part of the PBW theorem is to show that these elements span $U(\mathfrak{g})$. The
proof of this claim is essentially the same as the proof of the reordering lemma
(Lemma 6.12) in Chapter 6. Every element of the tensor algebra $T(\mathfrak{g})$, and hence
also of the universal enveloping algebra $U(\mathfrak{g})$, is a linear combination of products of
Lie algebra elements. Expanding each Lie algebra element in our basis shows that
every element of $U(\mathfrak{g})$ is a linear combination of products of basis elements, but not
(so far) necessarily in nondecreasing order. But using the relation XY - YX = [X, Y],
we can reorder any product of basis elements into the desired order, at the expense
of introducing several terms that are products of one fewer basis elements. These
smaller products can then, inductively, be rewritten as a linear combination of terms
that are in the correct order.
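The reordering procedure just described is an algorithm, and it can be sketched for $\mathfrak{g}$ = sl(2; C). The code below is an illustration under stated assumptions: the basis is ordered as Y, H, X (indices 0, 1, 2), the brackets are the standard ones [H, Y] = -2Y, [X, Y] = H, [X, H] = -2X, and a product of basis elements is stored as a tuple of indices. The names BRACKET and normal_order are hypothetical.

```python
# Basis indices: 0 = Y, 1 = H, 2 = X.  BRACKET[(a, b)] gives [a, b] for a > b
# as a dict {basis index: coefficient}.
BRACKET = {
    (1, 0): {0: -2},   # [H, Y] = -2Y
    (2, 0): {1: 1},    # [X, Y] = H
    (2, 1): {2: -2},   # [X, H] = -2X
}

def normal_order(element):
    """Rewrite a linear combination of words (dict {word: coefficient}) as a
    combination of nondecreasing words, using ab = ba + [a, b] for a > b."""
    pending = dict(element)
    result = {}
    while pending:
        word, coeff = pending.popitem()
        if coeff == 0:
            continue
        # Find the first adjacent pair that is out of order, if any.
        k = next((i for i in range(len(word) - 1) if word[i] > word[i + 1]), None)
        if k is None:
            result[word] = result.get(word, 0) + coeff
            continue
        a, b = word[k], word[k + 1]
        swapped = word[:k] + (b, a) + word[k + 2:]
        pending[swapped] = pending.get(swapped, 0) + coeff
        for c, s in BRACKET[(a, b)].items():   # shorter words from the bracket
            w = word[:k] + (c,) + word[k + 2:]
            pending[w] = pending.get(w, 0) + coeff * s
    return {w: c for w, c in result.items() if c != 0}
```

Each step replaces a word either by one of the same length with one fewer inversion or by a strictly shorter word, so the loop terminates. For instance, the word (2, 0, 0), representing $XY^2$, is rewritten as $Y^2X + 2YH - 2Y$. This sketch shows only that ordered words span; it says nothing about linear independence, which is the hard part of the theorem.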

It may seem "obvious" that elements of the form (9.10) are linearly independent.
Note, however, that any proof of independence of these elements must make use of
the Jacobi identity. After all, if $\mathfrak{g}$ is a vector space with any skew-symmetric, bilinear
"bracket" operation, we can still construct a "universal enveloping algebra" by the
construction in Sect. 9.3, and the elements of the form (9.10) will still span this
enveloping algebra. If, however, the bracket does not satisfy the Jacobi identity, the
elements in (9.10) will not be linearly independent. If they were, then, in particular,
the map $i : \mathfrak{g} \to U(\mathfrak{g})$ would be injective. We could then identify $\mathfrak{g}$ with its image
under i, which means that the bracket on $\mathfrak{g}$ would be given by [X, Y] = XY - YX,
where XY and YX are computed in the associative algebra $U(\mathfrak{g})$. But any bracket of
this form does satisfy the Jacobi identity.

We now proceed with the proof of the independence of the elements in (9.10).
The reader is encouraged to note the role of the Jacobi identity in our proof. Let D
be any vector space having a basis

$\{v_{(j_1, \ldots, j_N)}\}$

indexed by all nondecreasing tuples $(j_1, \ldots, j_N)$. We wish to construct a linear map
$\gamma : U(\mathfrak{g}) \to D$ with the property that

$\gamma(X_{j_1} X_{j_2} \cdots X_{j_N}) = v_{(j_1, \ldots, j_N)}$

for each nondecreasing tuple $(j_1, \ldots, j_N)$. Since the elements $v_{(j_1, \ldots, j_N)}$ are,
by construction, linearly independent, if such a map $\gamma$ exists, the elements
$X_{j_1} X_{j_2} \cdots X_{j_N}$ must be linearly independent as well. (Any linear relation among
the $X_{j_1} X_{j_2} \cdots X_{j_N}$'s would translate under $\gamma$ into a linear relation among the
$v_{(j_1, \ldots, j_N)}$'s.)

Instead of directly constructing $\gamma$, we will construct a linear map $\delta : T(\mathfrak{g}) \to D$
with the properties (1) that

$\delta(X_{j_1} \otimes X_{j_2} \otimes \cdots \otimes X_{j_N}) = v_{(j_1, \ldots, j_N)}$ (9.11)

for all nondecreasing tuples $(j_1, \ldots, j_N)$, and (2) that $\delta$ is zero on the two-sided
ideal J generated by the elements in (9.7). Since $\delta$ is zero on J, it gives rise to a map
$\gamma$ of $U(\mathfrak{g}) := T(\mathfrak{g})/J$ into D with the analogous property.

To keep our notation compact, we will now omit the tensor product symbol for
multiplication in $T(\mathfrak{g})$. All computations in the remainder of the section are in $T(\mathfrak{g})$,
so there will be no confusion. Suppose we can construct $\delta$ in such a way that for all
tuples $(j_1, \ldots, j_N)$ (not just for the nondecreasing tuples), we have

$\delta(X_{j_1} \cdots X_{j_k} X_{j_{k+1}} \cdots X_{j_N} - X_{j_1} \cdots X_{j_{k+1}} X_{j_k} \cdots X_{j_N})$
$= \delta(X_{j_1} \cdots [X_{j_k}, X_{j_{k+1}}] \cdots X_{j_N}).$ (9.12)

Then $\delta$ will indeed be zero on J. After all, J is spanned by elements of the form
$\alpha(XY - YX - [X, Y])\beta$, and, after expanding $\alpha$, $\beta$, X, and Y in terms of the basis,
every such element is a linear combination of differences between the argument of $\delta$
on the left-hand side of (9.12) and the argument of $\delta$ on the right-hand side.

Let the index of a monomial $X_{j_1} X_{j_2} \cdots X_{j_N}$ be the number of pairs $l < k$ for
which $j_l > j_k$. We will construct $\delta$ by induction first on the degree, n, of the
monomial, and then on the index of the monomial for a given degree. For degrees
0 and 1, there is not much to do. Assume, now, that we have defined $\delta$ on all
monomials of degree at most n - 1, and that it satisfies (9.12) whenever $N \leq n - 1$.
We now want to define $\delta$ on monomials of degree n, starting with monomials of
degree n and index 0. For these, we define $\delta$ by (9.11).

Now assume we have defined $\delta$ on monomials of degree n and index at most p in such a
way that (9.12) holds whenever both monomials on the left-hand side of (9.12) have
index at most p. Note that if p = 0, this condition holds vacuously, since the two
terms on the left-hand side cannot both have index 0 unless $j_{k+1} = j_k$, in which
case both sides of (9.12) are zero. We now consider a monomial $X_{j_1} X_{j_2} \cdots X_{j_n}$
of index $p + 1 \geq 1$. Since the index of the monomial is at least 1, the sequence
$(j_1, j_2, \ldots, j_n)$ is not weakly increasing, so there must be some k with $j_k > j_{k+1}$.
Pick such a k and "define" $\delta$ on the monomial by

$\delta(X_{j_1} \cdots X_{j_k} X_{j_{k+1}} \cdots X_{j_n}) = \delta(X_{j_1} \cdots X_{j_{k+1}} X_{j_k} \cdots X_{j_n})$
$+ \delta(X_{j_1} \cdots [X_{j_k}, X_{j_{k+1}}] \cdots X_{j_n}).$ (9.13)

Note that the first term on the right-hand side of (9.13) has index p and the second
term on the right-hand side has degree n - 1, which means that both of these terms
have been previously defined.

The crux of the matter is to show that the value of $\delta$ on a monomial of index
p + 1 is independent of the choice of k in (9.13). Suppose, then, that there is some
$l < k$ such that $j_l > j_{l+1}$ and $j_k > j_{k+1}$. We now proceed to check that the value
of the right-hand side of (9.13) is unchanged if we replace k by l.


Case 1: $l \leq k - 2$. In this case, the numbers l, l + 1, k, k + 1 are all distinct. Let
us consider the two apparently different ways of calculating $\delta$. If we use l, then
we have

$\delta(\cdots X_{j_l} X_{j_{l+1}} \cdots X_{j_k} X_{j_{k+1}} \cdots)$
$= \delta(\cdots X_{j_{l+1}} X_{j_l} \cdots X_{j_k} X_{j_{k+1}} \cdots) + \delta(\cdots [X_{j_l}, X_{j_{l+1}}] \cdots X_{j_k} X_{j_{k+1}} \cdots).$ (9.14)

Now, the second term on the right-hand side of (9.14) has degree n - 1. The first
term has index p, and if we reverse $X_{j_k}$ and $X_{j_{k+1}}$ we obtain a term of index p - 1.
Thus, we can apply our induction hypothesis to reverse the order of $X_{j_k}$ and $X_{j_{k+1}}$
in both terms on the right-hand side of (9.14), giving

$\delta(\cdots X_{j_l} X_{j_{l+1}} \cdots X_{j_k} X_{j_{k+1}} \cdots)$
$= \delta(\cdots X_{j_{l+1}} X_{j_l} \cdots X_{j_{k+1}} X_{j_k} \cdots) + \delta(\cdots X_{j_{l+1}} X_{j_l} \cdots [X_{j_k}, X_{j_{k+1}}] \cdots)$
$+ \delta(\cdots [X_{j_l}, X_{j_{l+1}}] \cdots X_{j_{k+1}} X_{j_k} \cdots) + \delta(\cdots [X_{j_l}, X_{j_{l+1}}] \cdots [X_{j_k}, X_{j_{k+1}}] \cdots).$ (9.15)

(Note that on the right-hand side of (9.15), all terms have both $X_{j_l}$ and $X_{j_{l+1}}$ and
$X_{j_k}$ and $X_{j_{k+1}}$ back in their correct PBW order, with $X_{j_{l+1}}$ to the left of $X_{j_l}$ and
$X_{j_{k+1}}$ to the left of $X_{j_k}$.)

But the right-hand side of (9.15) is symmetric in k and l, meaning that it is easy
to check that we get the same result if we started with k instead of l.


Case 2: $l = k - 1$. In this case, the indices $j_l$, $j_{l+1} = j_k$, and $j_{l+2} = j_{k+1}$ are in
the completely wrong order, $j_l > j_{l+1} > j_{l+2}$. Let us use the notation $X = X_{j_l}$,
$Y = X_{j_{l+1}}$, and $Z = X_{j_{l+2}}$. We wish to show that the value of $\delta(\cdots XYZ \cdots)$ is
the same whether we use l (that is, interchanging X and Y) or we use k (that is,
interchanging Y and Z).
If we interchange Y and Z, we get

$\delta(\cdots XYZ \cdots) = \delta(\cdots XZY \cdots) + \delta(\cdots X[Y, Z] \cdots).$

We now further simplify the result, by induction, until the factors of X, Y, and
Z are in their correct PBW order, which is ZYX. We obtain, then,

$\delta(\cdots XYZ \cdots)$
$= \delta(\cdots ZXY \cdots) + \delta(\cdots [X, Z]Y \cdots) + \delta(\cdots X[Y, Z] \cdots)$
$= \delta(\cdots ZYX \cdots) + \delta(\cdots Z[X, Y] \cdots)$
$+ \delta(\cdots [X, Z]Y \cdots) + \delta(\cdots X[Y, Z] \cdots).$ (9.16)


Meanwhile, if we interchange X and Y, we get 


5(-+»XYZ-++) 
= ô(--YXZ---) +80- 1X, FIZ) 
= 8(--YZX---) +80- Y[X, Z] e) +80 X, Y]Z ) 
= ô(--ZYX---) +80- [¥, ZK) 
+80- Y[X, Z] e) +806 [X,Y]Z -). (9.17) 
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The two ways of computing have a common term of 6(---ZYX---). The 
remaining terms differ in that each commutator has the remaining factor on the 
other side, such as Z[X, Y] in (9.16) versus [X, Y]Z in (9.17). Subtracting gives 


6G ZX, Y]---)—8(--[X,Y]Z---) 
+8- |X, Z]Y ---)— CSF (KZ) 
HAC NY Zs) = SG [Y, ZN es) (9.18) 


Since all of the terms in (9.18) are of degree n — 1, we can apply our induction 
hypothesis to simplify the result to 


6([Z, [X,Y] + [IX Z], Y] + [X, [Y, ZI). 


But [[X, Z], Y] = [Y,[Z, X]], which mean that the argument of 6 in the above 
expression is zero, by the Jacobi identity. 
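The final step relies only on the fact that the three double commutators sum to zero. As a sanity check, the following minimal sketch verifies this in a matrix algebra, where the bracket is the commutator. The three integer matrices and the helper names `mul`, `add`, `neg`, and `br` are illustrative choices, not taken from the text.

```python
# Illustrative check: with [A, B] = AB - BA, the three double commutators
# that appear after (9.18) sum to zero (the Jacobi identity).

def mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*ms):
    """Entrywise sum of any number of square matrices."""
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]

def neg(a):
    return [[-x for x in row] for row in a]

def br(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(mul(a, b), neg(mul(b, a)))

# Arbitrary sample matrices standing in for X, Y, Z.
X = [[1, 2], [3, 4]]
Y = [[0, 1], [5, -2]]
Z = [[7, -1], [2, 3]]

# [Z,[X,Y]] + [[X,Z],Y] + [X,[Y,Z]]
jacobi_sum = add(br(Z, br(X, Y)), br(br(X, Z), Y), br(X, br(Y, Z)))

# The rewriting used in the text: [[X,Z],Y] = [Y,[Z,X]].
same_term = br(br(X, Z), Y) == br(Y, br(Z, X))
```

Here `jacobi_sum` should be the zero matrix and `same_term` should be true for any choice of the three matrices.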


Once we have verified that the value of $\delta$ is independent of the choice of $k$ in (9.13),
it is clear that (9.12) holds, and we have completed the construction of the map $\delta$.


9.5 Construction of Verma Modules 


The proof will make use of the following definition: A subspace $I$ of $U(\mathfrak{g})$ is called
a left ideal if $\alpha\beta \in I$ for all $\alpha \in U(\mathfrak{g})$ and all $\beta \in I$. For any collection of vectors
$\{\alpha_j\}$ in $U(\mathfrak{g})$, we may form the left ideal $I$ "generated by" these vectors, that is, the
smallest left ideal in $U(\mathfrak{g})$ containing each $\alpha_j$. The left ideal $I$ is precisely the space
of elements of the form

\[
\sum_j \beta_j \alpha_j
\]

with $\beta_j$ being arbitrary elements of $U(\mathfrak{g})$.
Let $I_\mu$ denote the left ideal in $U(\mathfrak{g})$ generated by elements of the form

\[
H - \langle \mu, H \rangle 1, \quad H \in \mathfrak{h} \tag{9.19}
\]

and of the form

\[
X \in \mathfrak{g}_\alpha, \quad \alpha \in R^+. \tag{9.20}
\]

We now let $W_\mu$ denote the quotient vector space

\[
W_\mu = U(\mathfrak{g}) / I_\mu,
\]

and we let $[\alpha]$ denote the image of $\alpha \in U(\mathfrak{g})$ in the quotient space.
We may define a representation $\pi_\mu$ of $U(\mathfrak{g})$ acting on $W_\mu$ by setting

\[
\pi_\mu(\alpha)([\beta]) = [\alpha\beta] \tag{9.21}
\]

for all $\alpha$ and $\beta$ in $U(\mathfrak{g})$. To verify that $\pi_\mu(\alpha)$ is well defined, note that if $\beta'$ is
another representative of the equivalence class $[\beta]$, then $\beta' = \beta + \gamma$ for some $\gamma$ in
$I_\mu$. But then $\alpha\beta' = \alpha\beta + \alpha\gamma$, and $\alpha\gamma$ belongs to $I_\mu$, because $I_\mu$ is a left ideal. Thus,
$[\alpha\beta'] = [\alpha\beta]$. We may check that $\pi_\mu$ is a homomorphism by noting that $\pi_\mu(\alpha\beta)[\gamma]$
and $\pi_\mu(\alpha)\pi_\mu(\beta)[\gamma]$ both equal $[\alpha\beta\gamma]$, by the associativity of $U(\mathfrak{g})$. The restriction
of $\pi_\mu$ to $\mathfrak{g}$ constitutes a representation of $\mathfrak{g}$ acting on $W_\mu$.


Definition 9.11. The Verma module with highest weight $\mu$, denoted $W_\mu$, is the
quotient space $U(\mathfrak{g})/I_\mu$, where $I_\mu$ is the left ideal in $U(\mathfrak{g})$ generated by elements of
the form (9.19) and (9.20).


Theorem 9.12. The vector $v_0 := [1]$ is a nonzero element of $W_\mu$ and $W_\mu$ is
a highest weight cyclic representation with highest weight $\mu$ and highest weight
vector $v_0$.


The hard part of the proof is establishing that $v_0$ is nonzero; this amounts to
showing that the element 1 of $U(\mathfrak{g})$ is not in $I_\mu$. For purposes of constructing the
irreducible, finite-dimensional representations of $\mathfrak{g}$, Theorem 9.12 is sufficient. Our
method of proof, however, gives more information about the structure of $W_\mu$, which
we will make use of in Chapter 10.

Let $\mathfrak{n}^+$ denote the span of the root vectors $X_\alpha \in \mathfrak{g}_\alpha$ with $\alpha \in R^+$, and let
$\mathfrak{n}^-$ denote the span of the root vectors $Y_\alpha \in \mathfrak{g}_{-\alpha}$ with $\alpha \in R^+$. Because
$[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subset \mathfrak{g}_{\alpha+\beta}$, both $\mathfrak{n}^+$ and $\mathfrak{n}^-$ are subalgebras of $\mathfrak{g}$.


Theorem 9.13. If $Y_1, \ldots, Y_k$ form a basis for $\mathfrak{n}^-$, then the elements

\[
\pi_\mu(Y_1)^{n_1} \pi_\mu(Y_2)^{n_2} \cdots \pi_\mu(Y_k)^{n_k} v_0, \tag{9.22}
\]

where each $n_j$ is a non-negative integer, form a basis for $W_\mu$.


The theorem, together with the PBW theorem, tells us that there is a vector space
isomorphism between $U(\mathfrak{n}^-)$ and $W_\mu$ given by $\alpha \mapsto \pi_\mu(\alpha) v_0$, where $\pi_\mu$ is the
action of $U(\mathfrak{g})$ on $W_\mu$ given by (9.21).
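To make Theorem 9.13 concrete, consider the simplest case $\mathfrak{g} = \mathrm{sl}(2;\mathbb{C})$ with the usual basis $X, Y, H$, where $\mathfrak{n}^-$ is spanned by $Y$ and the basis (9.22) consists of the vectors $v_k = \pi_\mu(Y)^k v_0$. The sketch below is one possible encoding, under the assumption that the highest weight is recorded as the number $m = \langle \mu, H \rangle$ and that $X$ acts by the analog of formula (4.15). The sparse-dictionary representation and all function names are illustrative.

```python
from fractions import Fraction

# A vector in the sl(2; C) Verma module is a sparse dict {k: coeff},
# where k labels the basis vector v_k = Y^k v_0 (illustrative encoding).

def clean(v):
    """Drop zero coefficients so that equal vectors compare equal."""
    return {k: c for k, c in v.items() if c != 0}

def act_Y(v, m):
    """Y v_k = v_{k+1}."""
    return clean({k + 1: c for k, c in v.items()})

def act_H(v, m):
    """H v_k = (m - 2k) v_k."""
    return clean({k: (m - 2 * k) * c for k, c in v.items()})

def act_X(v, m):
    """X v_k = k (m - (k - 1)) v_{k-1}, and X v_0 = 0."""
    return clean({k - 1: k * (m - k + 1) * c for k, c in v.items() if k > 0})

def sub(u, v):
    out = dict(u)
    for k, c in v.items():
        out[k] = out.get(k, 0) - c
    return clean(out)

def scale(a, v):
    return clean({k: a * c for k, c in v.items()})

def relations_hold(m, kmax=6):
    """Check [X,Y] = H, [H,X] = 2X, [H,Y] = -2Y on v_0, ..., v_{kmax-1}."""
    for k in range(kmax):
        v = {k: 1}
        xy = sub(act_X(act_Y(v, m), m), act_Y(act_X(v, m), m))
        hx = sub(act_H(act_X(v, m), m), act_X(act_H(v, m), m))
        hy = sub(act_H(act_Y(v, m), m), act_Y(act_H(v, m), m))
        if xy != act_H(v, m) or hx != scale(2, act_X(v, m)) \
                or hy != scale(-2, act_Y(v, m)):
            return False
    return True
```

For example, `relations_hold(3)` and `relations_hold(Fraction(-5, 2))` both test the bracket relations; the highest weight need not be an integer for the Verma module to be defined.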


Lemma 9.14. Let $J_\mu$ denote the left ideal in $U(\mathfrak{b}) \subset U(\mathfrak{g})$ generated by elements
of the form (9.19) and (9.20). Then 1 does not belong to $J_\mu$.


Proof. Let $\mathfrak{b}$ be the direct sum (as a vector space) of $\mathfrak{n}^+$ and $\mathfrak{h}$, which is easily seen
to be a subalgebra of $\mathfrak{g}$. Let us define a one-dimensional representation $\sigma_\mu$ of $\mathfrak{b}$,
acting on $\mathbb{C}$, by the formula

\[
\sigma_\mu(X + H) = \langle \mu, H \rangle, \quad X \in \mathfrak{n}^+,\ H \in \mathfrak{h}.
\]

(That is to say, $\sigma_\mu(X + H)$ is the $1 \times 1$ matrix with entry $\langle \mu, H \rangle$.) To see that $\sigma_\mu$
is actually a representation, we note that all $1 \times 1$ matrices commute. On the other
hand, the commutator of two elements $Z_1$ and $Z_2$ of $\mathfrak{b}$ will lie in $\mathfrak{n}^+$, and $\sigma_\mu$ is
defined to be zero on $\mathfrak{n}^+$. Thus, $\sigma_\mu([Z_1, Z_2])$ and $[\sigma_\mu(Z_1), \sigma_\mu(Z_2)]$ are both zero.
By Proposition 9.8, the representation $\sigma_\mu$ of $\mathfrak{b}$ extends to a representation $\tilde{\sigma}_\mu$
of $U(\mathfrak{b})$ satisfying $\tilde{\sigma}_\mu(1) = 1$. Now, the kernel of $\tilde{\sigma}_\mu$ is easily seen to be a left
ideal in $U(\mathfrak{b})$, and by construction, the kernel of $\tilde{\sigma}_\mu$ will contain all elements of
the form (9.19) and (9.20). But since $\tilde{\sigma}_\mu(1) = 1$, the element 1 does not belong to
the kernel of $\tilde{\sigma}_\mu$. Thus, $\ker(\tilde{\sigma}_\mu)$ is a left ideal in $U(\mathfrak{b})$ containing all elements of the
form (9.19) and (9.20), which means that $\ker(\tilde{\sigma}_\mu)$ contains $J_\mu$. Since $\ker(\tilde{\sigma}_\mu)$ does
not contain 1, neither does $J_\mu$. □


Proof of Theorems 9.12 and 9.13. Note that for any $H \in \mathfrak{h}$, we have

\[
\pi_\mu(H - \langle \mu, H \rangle 1) v_0 = [H - \langle \mu, H \rangle 1] = 0,
\]

because $H - \langle \mu, H \rangle 1$ belongs to $I_\mu$. Thus, $\pi_\mu(H) v_0 = \langle \mu, H \rangle v_0$. Similarly, for
any $\alpha \in R^+$, we have

\[
\pi_\mu(X_\alpha) v_0 = [X_\alpha] = 0.
\]

Since elements of $U(\mathfrak{g})$ are linear combinations of products of elements of $\mathfrak{g}$, any
invariant subspace for the action of $\mathfrak{g}$ on $W_\mu$ is also invariant under the action of $U(\mathfrak{g})$
on $W_\mu$. Suppose, then, that $U$ is an invariant subspace of $W_\mu$ containing $v_0 = [1]$.
Then for any $[\alpha] \in W_\mu$, we have $\pi_\mu(\alpha) v_0 = [\alpha]$, and so $U = W_\mu$. Thus, $v_0$ is a
cyclic vector for $W_\mu$. To prove Theorem 9.12, it remains only to show that $v_0 \neq 0$.
The Lie algebra $\mathfrak{g}$ decomposes as a vector space direct sum of $\mathfrak{n}^-$ and $\mathfrak{b}$, where
$\mathfrak{n}^-$ is the span of the root spaces corresponding to roots in $R^-$ and where $\mathfrak{b}$ is the
span of $\mathfrak{h}$ and the root spaces corresponding to roots in $R^+$. Let us choose a basis
$Y_1, \ldots, Y_k, Z_1, \ldots, Z_l$ for $\mathfrak{g}$ consisting of a basis $Y_1, \ldots, Y_k$ for $\mathfrak{n}^-$ together with a
basis $Z_1, \ldots, Z_l$ for $\mathfrak{b}$. By applying the PBW theorem to this basis, we can easily
see (Exercise 4) that every element $\alpha$ of $U(\mathfrak{g})$ can be expressed uniquely in the form

\[
\alpha = \sum_{n_1, \ldots, n_k = 0}^{\infty} Y_1^{n_1} Y_2^{n_2} \cdots Y_k^{n_k} a_{n_1, \ldots, n_k}, \tag{9.23}
\]

where each $a_{n_1, \ldots, n_k}$ belongs to $U(\mathfrak{b}) \subset U(\mathfrak{g})$. (For each $\alpha \in U(\mathfrak{g})$, only finitely
many of the $a_{n_1, \ldots, n_k}$'s will be nonzero.)

Suppose now that $\alpha$ belongs to $I_\mu$, which means that $\alpha$ is a linear combination
of terms of the form $\beta(H - \langle \mu, H \rangle 1)$ and $\beta X_\alpha$, with $\beta$ in $U(\mathfrak{g})$, $H$ in $\mathfrak{h}$, and $X_\alpha$ in
$\mathfrak{g}_\alpha$, $\alpha \in R^+$. By writing each $\beta$ as in (9.23), we see that $\alpha$ is a linear combination
of terms of the form

\[
Y_1^{n_1} Y_2^{n_2} \cdots Y_k^{n_k} b_{n_1, \ldots, n_k} (H - \langle \mu, H \rangle 1)
\]

and

\[
Y_1^{n_1} Y_2^{n_2} \cdots Y_k^{n_k} b_{n_1, \ldots, n_k} X_\alpha
\]

with $b_{n_1, \ldots, n_k}$ in $U(\mathfrak{b})$. Note that $b_{n_1, \ldots, n_k}(H - \langle \mu, H \rangle 1)$ and $b_{n_1, \ldots, n_k} X_\alpha$ belong
to the left ideal $J_\mu \subset U(\mathfrak{b})$. Thus, if $\alpha$ is in $I_\mu$, each $a_{n_1, \ldots, n_k}$ in the (unique)
expansion (9.23) of $\alpha$ must belong to $J_\mu$.

Now, by the uniqueness of the expansion, the only way the element $\alpha$ in (9.23)
can equal 1 is if there is only one term, the one with $n_1 = \cdots = n_k = 0$, and if
$a_{0, \ldots, 0} = 1$. On the other hand, for $\alpha$ to be in $I_\mu$, each $a_{n_1, \ldots, n_k}$ must belong to $J_\mu$.
Since (Lemma 9.14) 1 is not in $J_\mu$, we see that 1 is not in $I_\mu$, so that $v_0 = [1]$ is
nonzero in $U(\mathfrak{g})/I_\mu$.

We now argue that the vectors in (9.22) are linearly independent in $U(\mathfrak{g})/I_\mu$.
Suppose, then, that a linear combination of these vectors, with coefficients $c_{n_1, \ldots, n_k}$,
equals zero. Then the corresponding linear combination of elements in $U(\mathfrak{g})$, namely

\[
\alpha = \sum_{n_1, \ldots, n_k = 0}^{\infty} Y_1^{n_1} Y_2^{n_2} \cdots Y_k^{n_k} c_{n_1, \ldots, n_k},
\]

belongs to $I_\mu$. But as shown above, for $\alpha$ to be in $I_\mu$, each of the constants $c_{n_1, \ldots, n_k}$
must be in $J_\mu$. Thus, by Lemma 9.14, each of the constants $c_{n_1, \ldots, n_k}$ is zero. □


The main role of the PBW theorem in the preceding proof is to establish the 
uniqueness of the expansion (9.23). 


9.6 Irreducible Quotient Modules 


In this section, we show that every Verma module has a largest proper invariant
subspace $U_\mu$ and that the quotient space $V_\mu := W_\mu/U_\mu$ is irreducible with highest
weight $\mu$. In the next section, we will show that if $\mu$ is dominant integral, this
quotient space is finite dimensional.

It is easy to see (Exercise 6) that the Verma module $W_\mu$ is the direct sum of its
weight spaces. It therefore makes sense to talk about the component of a vector
$v \in W_\mu$ in the one-dimensional subspace spanned by $v_0$, which we refer to as the
$v_0$-component of $v$. As in the previous section, we let $\mathfrak{n}^+$ denote the subalgebra of
$\mathfrak{g}$ spanned by weight vectors $X_\alpha \in \mathfrak{g}_\alpha$ with $\alpha \in R^+$.


Definition 9.15. For any Verma module $W_\mu$, let $U_\mu$ be the subspace of $W_\mu$
consisting of all vectors $v$ such that the $v_0$-component of $v$ is zero and such that
the $v_0$-component of

\[
\pi_\mu(X_1) \cdots \pi_\mu(X_N) v
\]

is also zero for any collection of vectors $X_1, \ldots, X_N$ in $\mathfrak{n}^+$.
That is to say, a vector $v$ belongs to $U_\mu$ if we cannot "get to" $v_0$ from $v$ by
applying "raising operators" $X \in \mathfrak{g}_\alpha$, $\alpha \in R^+$. Certainly the zero vector is in $U_\mu$;
for some $\mathfrak{g}$'s and $\mu$'s, it happens that $U_\mu = \{0\}$.


Proposition 9.16. The space $U_\mu$ is an invariant subspace for the action of $\mathfrak{g}$.


Proof. Suppose that $v$ is in $U_\mu$ and that $Z$ is some element of $\mathfrak{g}$. We want to show
that $\pi_\mu(Z) v$ is also in $U_\mu$. Thus, we consider

\[
\pi_\mu(X_1) \cdots \pi_\mu(X_N) \pi_\mu(Z) v \tag{9.24}
\]

and we must show that the $v_0$-component of this vector is zero. Using the reordering
lemma (Lemma 6.12), we may rewrite the vector in (9.24) as a linear combination
of vectors of the form

\[
\pi_\mu(Y_1) \cdots \pi_\mu(Y_j)\, \pi_\mu(H_1) \cdots \pi_\mu(H_k)\, \pi_\mu(X'_1) \cdots \pi_\mu(X'_m) v, \tag{9.25}
\]

where the $Y$'s are in $\mathfrak{n}^-$, the $H$'s are in $\mathfrak{h}$, and the $X'$'s are in $\mathfrak{n}^+$. However, since $v$
is in $U_\mu$, the $v_0$-component of

\[
\pi_\mu(X'_1) \cdots \pi_\mu(X'_m) v \tag{9.26}
\]

is zero; thus, this vector is a linear combination of weight vectors with weight lower
than $\mu$. Then applying elements of $\mathfrak{h}$ and $\mathfrak{n}^-$ to the vector in (9.26) will only keep
the weights the same or lower them. Thus, the $v_0$-component of the vector in (9.25),
and hence also the $v_0$-component of the vector in (9.24), is zero. This shows that
$\pi_\mu(Z) v$ is, again, in $U_\mu$. □


Since $U_\mu$ is an invariant subspace of $W_\mu$, the quotient vector space $W_\mu/U_\mu$
carries a natural action of $\mathfrak{g}$ and thus constitutes a representation of $\mathfrak{g}$.


Proposition 9.17. The quotient space $V_\mu := W_\mu/U_\mu$ is an irreducible representation
of $\mathfrak{g}$.


Proof. A simple argument shows that the invariant subspaces of the representation
$W_\mu/U_\mu$ are in one-to-one correspondence with the invariant subspaces of $W_\mu$ that
contain $U_\mu$. Thus, proving that $V_\mu$ is irreducible is equivalent to showing that any
invariant subspace of $W_\mu$ that contains $U_\mu$ is either $U_\mu$ or $W_\mu$. Suppose, then, that
$X$ is an invariant subspace that contains $U_\mu$ and at least one vector $v$ that is not
in $U_\mu$. This means that $X$ also contains a vector $u = \pi_\mu(X_1) \cdots \pi_\mu(X_N) v$ whose
$v_0$-component is nonzero.

We now claim that $X$ must contain $v_0$ itself. To see this, we decompose $u$ as
a nonzero multiple of $v_0$ plus a sum of weight vectors corresponding to weights
$\lambda \neq \mu$. Since $\lambda \neq \mu$, we can find $H$ in $\mathfrak{h}$ with $\langle \lambda, H \rangle \neq \langle \mu, H \rangle$ and then we
may apply to $u$ the operator $\pi_\mu(H) - \langle \lambda, H \rangle I$. This operator will keep us in $X$
and will "kill" the component of $u$ that is in the weight space corresponding to
the weight $\lambda$ while leaving the $v_0$-component of $u$ nonzero. We can then continue
applying operators of this form until we have killed all the components of $u$ in
weight spaces different from $\mu$, giving us a nonzero multiple of $v_0$. We conclude,
then, that $X$ contains $v_0$ and, therefore, all of $W_\mu$. Thus, any invariant subspace of
$W_\mu$ that properly contains $U_\mu$ must be equal to $W_\mu$. □
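The "killing" step in this proof can be illustrated in the simplest setting. The sketch below assumes the $\mathrm{sl}(2;\mathbb{C})$ Verma module with basis vectors $v_k$ of $H$-eigenvalue $m - 2k$, so that each unwanted weight component is removed by one factor of $\pi_\mu(H) - \langle \lambda, H \rangle I$. The sample vector, the value $m = 3$, and the function names are illustrative.

```python
# Illustrative sl(2; C) setting: the basis vector v_k has H-eigenvalue
# m - 2k, and v_0 is the highest weight vector.

def apply_H_minus(v, m, eigenvalue):
    """Apply pi(H) - eigenvalue * I to a sparse vector {k: coeff}."""
    out = {k: ((m - 2 * k) - eigenvalue) * c for k, c in v.items()}
    return {k: c for k, c in out.items() if c != 0}

def isolate_v0(u, m):
    """Kill every component of u except the v_0-component."""
    for k in sorted(u):              # indices present in the original u
        if k != 0:
            u = apply_H_minus(u, m, m - 2 * k)
    return u

m = 3
u = {0: 2, 1: 1, 2: 5}               # u = 2 v_0 + v_1 + 5 v_2
result = isolate_v0(u, m)            # a nonzero multiple of v_0
```

Each factor multiplies the $v_0$-coefficient by a nonzero number, namely the difference between the eigenvalue of $v_0$ and the eigenvalue being killed, which is why the $v_0$-component survives.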


Since for each $u \in U_\mu$ the $v_0$-component of $u$ is zero, the vector $v_0$ is not
in $U_\mu$. Thus, the quotient space $W_\mu/U_\mu$ is still a highest weight cyclic representation
with highest weight $\mu$ and with highest weight vector being the image of $v_0$ in the
quotient.


Example 9.18. Let $\alpha$ be an element of $\Delta$ and let $\mathfrak{s}^\alpha = \langle X_\alpha, Y_\alpha, H_\alpha \rangle$ be as in
Theorem 7.19. If $\langle \mu, H_\alpha \rangle$ is a non-negative integer $m$, then the vector
$v := \pi_\mu(Y_\alpha)^{m+1} v_0$ belongs to $U_\mu$.


This result is illustrated in Figure 9.2.


Proof. By the argument in the proof of Theorem 4.32, the analog of (4.15) will hold
here:

\[
\pi_\mu(X_\alpha) \pi_\mu(Y_\alpha)^j v_0 = j(m - (j - 1)) \pi_\mu(Y_\alpha)^{j-1} v_0.
\]

Thus, taking $j = m + 1$,

\[
\pi_\mu(X_\alpha) v = (m + 1)(m - m) \pi_\mu(Y_\alpha)^m v_0 = 0.
\]

Meanwhile, if $\beta \in R^+$ with $\beta \neq \alpha$, then for $X_\beta \in \mathfrak{g}_\beta$, if $\pi_\mu(X_\beta) v$ were nonzero, it
would be a weight vector with weight

\[
\lambda = \mu - (m + 1)\alpha + \beta.
\]

Since $\lambda$ is not lower than $\mu$, we must have $\pi_\mu(X_\beta) v = 0$. Thus, the $v_0$-component of
$v$ is zero and $\pi_\mu(X_\eta) v = 0$ for all $\eta \in R^+$, which implies that $v$ is in $U_\mu$. □

Fig. 9.2 Example 9.18 in the case $\alpha = \alpha_1$ and $m = 2$. The vector $v$ belongs to $U_\mu$


9.7 Finite-Dimensional Quotient Modules 


Throughout this section, we assume that $\mu$ is a dominant integral element. We will
now show that, in this case, the irreducible quotient space $V_\mu := W_\mu/U_\mu$ is finite
dimensional. Our strategy is to show that the set of weights for $V_\mu$ is invariant
under the action of the Weyl group on $\mathfrak{h}$. Now, if $\mu$ is dominant integral, then every
weight $\lambda$ of $W_\mu$ (and thus also of $V_\mu$) must be integral, since $\mu - \lambda$ is an integer
combination of roots. Hence, if the weights of $V_\mu$ are invariant under $W$, we must
have $w \cdot \lambda \preceq \mu$ for all $w \in W$. But it is not hard to show that there are only finitely
many integral elements with this property. We will conclude, then, that there are only
finitely many weights in $V_\mu$. Since (even in the Verma module) each weight has
finite multiplicity, this will show that $V_\mu = W_\mu/U_\mu$ is finite dimensional.

How, then, do we construct an action of the Weyl group on $V_\mu$? If we attempt to
follow the proof of Theorem 9.3, we must contend with the fact that $V_\mu$ is not yet
known to be finite dimensional. Thus, we need a method of exponentiating operators
on a possibly infinite-dimensional space.


Definition 9.19. A linear operator $X$ on a vector space $V$ is locally nilpotent if for
each $v \in V$, there exists a positive integer $k$ such that $X^k v = 0$.


If $V$ is finite dimensional, then a locally nilpotent operator must actually be
nilpotent, that is, there must exist a single $k$ such that $X^k v = 0$ for all $v$. In
the infinite-dimensional case, the value of $k$ depends on $v$ and there may be no
single value of $k$ that works for all $v$. If $X$ is locally nilpotent, then we define $e^X$ to
be the operator satisfying

\[
e^X v = \sum_{k=0}^{\infty} \frac{X^k}{k!} v, \tag{9.27}
\]

where for each $v \in V$, the series on the right terminates.
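A standard example of a locally nilpotent operator that is not nilpotent is differentiation acting on the space of all polynomials: each polynomial is killed by enough derivatives, but no single power works for every polynomial. The sketch below, an illustration rather than anything from the text, evaluates the terminating series (9.27) for this operator. Polynomials are stored as coefficient lists, and the names `D` and `exp_locally_nilpotent` are hypothetical.

```python
from fractions import Fraction

def D(p):
    """Differentiate a polynomial given by its coefficient list [a0, a1, ...]."""
    return [k * p[k] for k in range(1, len(p))]

def exp_locally_nilpotent(op, v):
    """Sum of op^k v / k!, stopping once op^k v = 0 (the series terminates)."""
    total = [Fraction(0)] * len(v)
    term, k, fact = list(v), 0, 1
    while any(c != 0 for c in term):
        for i, c in enumerate(term):
            total[i] += Fraction(c, fact)
        term = op(term)              # next power of the operator applied to v
        k += 1
        fact *= k                    # fact == k!
    return total

p = [0, 0, 0, 1]                       # p(x) = x^3
shifted = exp_locally_nilpotent(D, p)  # coefficients of e^D p
```

By Taylor's formula, $e^{D}$ acts as the shift $p(x) \mapsto p(x + 1)$, so `shifted` lists the coefficients of $(x + 1)^3$.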


Proposition 9.20. For each $\alpha \in \Delta$, let $\mathfrak{s}^\alpha = \langle X_\alpha, Y_\alpha, H_\alpha \rangle$ be as in Theorem 7.19.
If $\mu$ is dominant integral, then $X_\alpha$ and $Y_\alpha$ act in a locally nilpotent fashion on the
quotient space $V_\mu$.

Proof. For any $X \in \mathfrak{g}$, we use $\tilde{X}$ as an abbreviation for the action of $X$ on the
quotient space $V_\mu$. We say that a vector in $V_\mu$ is $\mathfrak{s}^\alpha$-finite if it is contained in a
finite-dimensional, $\mathfrak{s}^\alpha$-invariant subspace. Let

\[
m = \langle \mu, H_\alpha \rangle,
\]

which is a non-negative integer because $\mu$ is dominant integral. Let $\tilde{v}_0$ denote the
image in $V_\mu$ of the highest weight vector $v_0 \in W_\mu$, and consider the vectors

\[
\tilde{v}_k = \tilde{Y}_\alpha^k \tilde{v}_0, \quad k = 0, 1, 2, \ldots
\]

By the calculations in Sect. 4.6, the span of the $\tilde{v}_k$'s is invariant under the action
of $\mathfrak{s}^\alpha$. On the other hand, Example 9.18 shows that $\pi_\mu(Y_\alpha)^{m+1} v_0$ is in $U_\mu$, which
means that $\tilde{v}_{m+1} = \tilde{Y}_\alpha^{m+1} \tilde{v}_0$ is the zero element of $V_\mu$. Thus, $\langle \tilde{v}_0, \ldots, \tilde{v}_m \rangle$ is a
finite-dimensional, $\mathfrak{s}^\alpha$-invariant subspace (Figure 9.3). In particular, there exists a
nonzero, $\mathfrak{s}^\alpha$-finite vector in $V_\mu$.

Now let $T_\alpha \subset V_\mu$ be the space of all $\mathfrak{s}^\alpha$-finite vectors, which we have just shown
to be nonzero. We now claim that $T_\alpha$ is invariant under the action of $\mathfrak{g}$. To see
this, fix a vector $v$ in $T_\alpha$ and an element $X$ of $\mathfrak{g}$. Let $S$ be a finite-dimensional,
$\mathfrak{s}^\alpha$-invariant subspace containing $v$ and let $S'$ be the span of all vectors of the form
$\tilde{Y} w$ with $Y \in \mathfrak{g}$ and $w \in S$. Then $S'$ is finite dimensional, having dimension at most
$(\dim \mathfrak{g})(\dim S)$. Furthermore, if $Z \in \mathfrak{s}^\alpha$, then for all $w \in S$, we have

\[
\tilde{Z} \tilde{Y} w = \tilde{Y} \tilde{Z} w + \widetilde{[Z, Y]} w,
\]

which belongs to $S'$, because $\tilde{Z} w$ is again in $S$. Thus, $S'$ is also invariant under
the action of $\mathfrak{s}^\alpha$. We see, then, that $\tilde{X} v$ is contained in the finite-dimensional,
$\mathfrak{s}^\alpha$-invariant subspace $S'$; that is, $\tilde{X} v \in T_\alpha$. Since $V_\mu$ is irreducible and $T_\alpha$ is nonzero
and invariant under the action of $\mathfrak{g}$, we have $T_\alpha = V_\mu$.

Fig. 9.3 Since the vector $v$ is in $U_\mu$ (Figure 9.2), the circled weights span an $\mathfrak{s}^\alpha$-invariant subspace
of $W_\mu/U_\mu$

We conclude that every $v \in V_\mu$ is contained in a finite-dimensional, $\mathfrak{s}^\alpha$-invariant
subspace. It then follows from Point 2 of Theorem 4.34 that $(\tilde{X}_\alpha)^k v = (\tilde{Y}_\alpha)^k v = 0$
for some $k$, showing that $\tilde{X}_\alpha$ and $\tilde{Y}_\alpha$ are locally nilpotent. □


Proposition 9.21. If $\mu$ is dominant integral, the set of weights for $V_\mu$ is invariant
under the action of the Weyl group on $\mathfrak{h}$.


Proof. We continue the notation from the proof of Proposition 9.20. Since
(Proposition 8.24) $W$ is generated by the reflections $s_\alpha$ with $\alpha \in \Delta$, it suffices to show that
the weights of $W_\mu/U_\mu$ are invariant under each such reflection. By Proposition 9.20,
$\tilde{X}_\alpha$ and $\tilde{Y}_\alpha$ are locally nilpotent, and thus it makes sense to define operators $S_\alpha$ by

\[
S_\alpha = e^{\tilde{X}_\alpha} e^{-\tilde{Y}_\alpha} e^{\tilde{X}_\alpha}.
\]

We may now imitate the proof of Theorem 9.3 as follows. If $H \in \mathfrak{h}$ satisfies
$\langle \alpha, H \rangle = 0$, then $[H, X_\alpha] = [H, Y_\alpha] = 0$, which means that $\tilde{H}$ commutes with
$\tilde{X}_\alpha$ and $\tilde{Y}_\alpha$ and, thus, with $S_\alpha$. Meanwhile, for any $v \in V_\mu$, we may find a
finite-dimensional, $\mathfrak{s}^\alpha$-invariant subspace $S$ containing $v$. In the space $S$, we may apply
Point 3 of Theorem 4.34 to show that

\[
S_\alpha \tilde{H}_\alpha S_\alpha^{-1} v = -\tilde{H}_\alpha v.
\]

We conclude that for all $H \in \mathfrak{h}$, we have

\[
S_\alpha \tilde{H} S_\alpha^{-1} = \widetilde{s_\alpha \cdot H}.
\]

From this point on, the proof of Theorem 9.3 applies without change. □


Figure 9.4 illustrates the result of Proposition 9.21 in the case of sl(2;C) and 
highest weight 3. If $v$ is a weight vector with weight $-5$, then $\pi(X) v = 0$,
by (4.15) in the proof of Theorem 4.32. Thus, the span of the weight vectors with
weights $l \leq -5$ is invariant. The quotient space has weights ranging from $-3$ to $3$
in increments of 2 and is, thus, invariant under the action of $W = \{I, -I\}$.
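The numbers in this example can be reproduced directly from the analog of (4.15). The following sketch assumes the basis $v_k = \pi(Y)^k v_0$ of the $\mathrm{sl}(2;\mathbb{C})$ Verma module, in which $v_k$ has weight $m - 2k$ and $\pi(X) v_k = k(m - (k - 1)) v_{k-1}$. The function and variable names are illustrative.

```python
def x_coefficient(m, k):
    """Coefficient c in X v_k = c v_{k-1}, from the analog of (4.15)."""
    return k * (m - (k - 1))

m = 3                                     # highest weight, as in Figure 9.4

# First k >= 1 for which X v_k = 0: this v_k generates the invariant subspace.
singular_k = next(k for k in range(1, 8) if x_coefficient(m, k) == 0)
singular_weight = m - 2 * singular_k

# Weights that survive in the quotient by that invariant subspace.
quotient_weights = [m - 2 * k for k in range(singular_k)]
```

With $m = 3$ the first vanishing coefficient occurs at $k = m + 1 = 4$, which is the weight $-5$ vector of the text, and the surviving weights are symmetric under $l \mapsto -l$.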


Fig. 9.4 In the Verma module for sl(2; C) with highest weight 3, the span of the weight vectors
with weights $-5, -7, \ldots$ is invariant

We are now ready for the proof of the existence of finite-dimensional, irreducible
representations.


Proof of Theorem 9.5. The quotient space $V_\mu := W_\mu/U_\mu$ is irreducible and has
highest weight $\mu$. Every weight $\lambda$ of $V_\mu$ is integral and satisfies $\lambda \preceq \mu$. By
Proposition 9.21, the weights are also invariant under the action of $W$, which means
that every weight $\lambda$ satisfies $w \cdot \lambda \preceq \mu$ for all $w \in W$. Thus, by Proposition 8.44, $\lambda$
must be in the convex hull of the $W$-orbit of $\mu$, which implies that $\|\lambda\| \leq \|\mu\|$.
Since there are only finitely many integral elements $\lambda$ with this property, we
conclude that $V_\mu$ has only finitely many weights.

Now, $V_\mu$ has at least one weight, namely $\mu$. Since $V_\mu$ is irreducible, it must be
the direct sum of its weight spaces. Now, since the elements in (9.22) form a basis
for the Verma module $W_\mu$, the corresponding elements of $V_\mu$ certainly span $V_\mu$.
But for a given weight $\lambda$, there are only finitely many choices of the exponents
$n_1, \ldots, n_k$ in (9.22) that give a weight vector with weight $\lambda$. (After all, if any of the
$n_j$'s is large, the weight of the vector in (9.22) will be much lower than $\mu$.) Thus,
each weight of $V_\mu$ has finite multiplicity. Since, also, there are only finitely many
weights, $V_\mu$ is finite dimensional. □
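The finiteness of the number of exponent choices can be seen concretely by enumeration. The sketch below assumes the case $\mathfrak{g} = \mathrm{sl}(3;\mathbb{C})$, whose positive roots are $\alpha_1$, $\alpha_2$, and $\alpha_1 + \alpha_2$, and lists all exponent triples in (9.22) producing the weight $\mu - a\alpha_1 - b\alpha_2$. The coordinate encoding and the function name are illustrative.

```python
from itertools import product

# Positive roots of sl(3; C) in the basis of simple roots (alpha_1, alpha_2).
POSITIVE_ROOTS = [(1, 0), (0, 1), (1, 1)]

def exponent_tuples(a, b):
    """All (n1, n2, n3) with n1*a1 + n2*a2 + n3*(a1 + a2) = a*a1 + b*a2."""
    bound = max(a, b) + 1            # no exponent can exceed max(a, b)
    found = []
    for ns in product(range(bound), repeat=len(POSITIVE_ROOTS)):
        total = tuple(sum(n * r[i] for n, r in zip(ns, POSITIVE_ROOTS))
                      for i in range(2))
        if total == (a, b):
            found.append(ns)
    return found
```

For instance, `exponent_tuples(2, 1)` returns the two triples `(1, 0, 1)` and `(2, 1, 0)`, so only two of the vectors in (9.22) have weight $\mu - 2\alpha_1 - \alpha_2$ in this case.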


9.8 Exercises 


1. Suppose $(i, U(\mathfrak{g}))$ and $(i', U'(\mathfrak{g}))$ are algebras as in Theorem 9.7. Show that there
is an isomorphism $\Phi : U(\mathfrak{g}) \to U'(\mathfrak{g})$ such that $\Phi(1) = 1$ and such that

\[
\Phi(i(X)) = i'(X)
\]

for all $X \in \mathfrak{g}$.
Hint: Use the defining property of $U(\mathfrak{g})$ to construct $\Phi$.

2. Suppose that $\mathfrak{g} \subset M_n(\mathbb{C})$ is a Lie algebra of matrices (with bracket given by
$XY - YX$). Prove, without appealing to the PBW theorem, that the map $i : \mathfrak{g} \to U(\mathfrak{g})$
in Theorem 9.7 is injective.

3. Suppose $\mathfrak{g}$ is a Lie algebra and $\mathfrak{h}$ is a subalgebra. Apply Theorem 9.7 to the Lie
algebra $\mathfrak{h}$ with $A = U(\mathfrak{g})$ and with $j$ being the inclusion of $\mathfrak{h}$ into $\mathfrak{g} \subset U(\mathfrak{g})$.
If $\phi : U(\mathfrak{h}) \to U(\mathfrak{g})$ is the associated algebra homomorphism, show that $\phi$ is
injective.

Hint: Use the PBW theorem.

4. Using the PBW theorem for $\mathfrak{b}$ (applied to the basis $Z_1, \ldots, Z_l$) and for $\mathfrak{g}$ (applied
to the basis $Y_1, \ldots, Y_k, Z_1, \ldots, Z_l$), establish first the existence and then the
uniqueness of the expansion in (9.23).
Hint: For the uniqueness result, first prove that if $\alpha$ is a nonzero element of $U(\mathfrak{b})$,
then $Y_1^{n_1} \cdots Y_k^{n_k} \alpha$ is a nonzero element of $U(\mathfrak{g})$, for any sequence $n_1, \ldots, n_k$ of
non-negative integers. Then prove that a linear combination as in (9.23) cannot
be zero unless each of the elements $a_{n_1, \ldots, n_k}$ in $U(\mathfrak{b})$ is zero.

5. Let $\mu$ be any element of $\mathfrak{h}$ and let $W_\mu := U(\mathfrak{g})/I_\mu$ be the Verma module with
highest weight $\mu$. Now let $\sigma_\mu$ be any other highest weight cyclic representation
of $\mathfrak{g}$ with highest weight $\mu$, acting on a vector space $V$. Show that there is a
surjective intertwining map $\phi$ of $W_\mu$ onto $V$.

Note: It follows that $V$ is isomorphic to the quotient space $W_\mu/\ker(\phi)$. Thus,
$W_\mu$ is maximal among highest weight cyclic representations with highest weight
$\mu$, in the sense that every other such representation is a quotient of $W_\mu$.

Hint: If $\tilde{\sigma}_\mu$ is the extension of $\sigma_\mu$ to $U(\mathfrak{g})$, as in Proposition 9.8, construct a map
$\psi : U(\mathfrak{g}) \to V$ by mapping $\alpha \in U(\mathfrak{g})$ to $\tilde{\sigma}_\mu(\alpha) w_0$, where $w_0$ is a highest weight
vector for $V$.

6. Let $W_\mu := U(\mathfrak{g})/I_\mu$ be the Verma module with highest weight $\mu$ and highest
weight vector $v_0$. Let $X$ be the subspace of $W_\mu$ consisting of all those vectors
that can be expressed as finite linear combinations of weight vectors. Show that
$X$ contains $v_0$ and is invariant under the action of $\mathfrak{g}$ on $W_\mu$. Conclude that $W_\mu$ is
the direct sum of its weight spaces.

7. Let $\mu$ be any element of $\mathfrak{h}$ and let $W_\mu$ be the associated Verma module. Suppose
$\lambda \in \mathfrak{h}$ can be expressed in the form

\[
\lambda = \mu - n_1 \alpha_1 - \cdots - n_k \alpha_k, \tag{9.28}
\]

where $\alpha_1, \ldots, \alpha_k$ are the positive roots and $n_1, \ldots, n_k$ are non-negative integers.
Show that the multiplicity of $\lambda$ in $W_\mu$ is equal to the number of ways that $\mu - \lambda$
can be expressed as a linear combination of positive roots with non-negative
integer coefficients. That is to say, the multiplicity of $\lambda$ is the number of $k$-tuples
of non-negative integers $(n_1, \ldots, n_k)$ for which (9.28) holds.


Chapter 10 
Further Properties of the Representations 


In this chapter we derive several important properties of the representations we
constructed in the previous chapter. Throughout the chapter, $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ denotes a
complex semisimple Lie algebra, $\mathfrak{h} = \mathfrak{t}_{\mathbb{C}}$ denotes a fixed Cartan subalgebra of $\mathfrak{g}$,
and $R$ denotes the set of roots for $\mathfrak{g}$ relative to $\mathfrak{h}$. We let $W$ denote the Weyl group,
we let $\Delta$ denote a fixed base for $R$, and we let $R^+$ and $R^-$ denote the positive and
negative roots with respect to $\Delta$, respectively.


10.1 The Structure of the Weights 


In this section, we establish the general version of Theorems 6.24 and 6.25, which
treated the case of sl(3; C). The general result tells us which integral elements appear
as weights of a fixed finite-dimensional irreducible representation. In Sect. 10.6, we
will establish a formula for the multiplicities of the weights, as a consequence of the
Weyl character formula.

Recall that the weights of a finite-dimensional representation of $\mathfrak{g}$ are integral
elements (Proposition 9.2) and that the weights and their multiplicities are invariant
under the action of $W$ (Theorem 9.3). We now determine which weights occur in
the representation with highest weight $\mu$. Recall from Definition 6.23 the notion of
the convex hull of a collection of vectors.


Theorem 10.1. Let $(\pi, V_\mu)$ be an irreducible finite-dimensional representation of
$\mathfrak{g}$ with highest weight $\mu$. An integral element $\lambda$ is a weight of $V_\mu$ if and only if the
following two conditions are satisfied.


1. $\lambda$ belongs to the convex hull of the Weyl-group orbit of $\mu$.
2. $\mu - \lambda$ can be expressed as an integer combination of roots.

Fig. 10.1 Typical weight diagram for the Lie algebra so(5; C)


Figure 10.1 shows a typical example for the Lie algebra so(5; C). In the figure,
the square lattice indicates the set of integral elements, the highest weight is circled,
and the black dots indicate the weights of the representation. A number next to
a dot indicates the multiplicity, with an unnumbered dot representing a weight of
multiplicity 1. The multiplicities can be calculated using the Kostant multiplicity
formula (Sect. 10.6). Note that the multiplicities do not have the sort of simple
pattern that we saw in Sect. 6.7 for the case of sl(3; C); that is, the multiplicities
for so(5; C) are not constant on the "rings" in the weight diagram.

We will use the notation $W \cdot \mu$ to denote the Weyl group orbit of an element $\mu$,
and the notation $\mathrm{Conv}(E)$ to denote the convex hull of $E$. The following result is
the key step on the way to proving Theorem 10.1.


Proposition 10.2. Let $\mu$ be a dominant integral element. Suppose $\lambda$ is dominant, $\lambda$
is lower than $\mu$, and $\mu - \lambda$ can be expressed as an integer combination of roots.
Then $\lambda$ is a weight of the irreducible representation with highest weight $\mu$.


Lemma 10.3 ("No Holes" Lemma). Suppose $(\pi, V)$ is a finite-dimensional
representation of $\mathfrak{g}$. Suppose that $\lambda$ is a weight of $V$ and that $\langle \lambda, \alpha \rangle > 0$ for some root $\alpha$.
If we define $j$ by

\[
j = \langle \lambda, H_\alpha \rangle = 2 \frac{\langle \lambda, \alpha \rangle}{\langle \alpha, \alpha \rangle},
\]

then $\lambda - k\alpha$ is a weight of $\pi$ for every integer $k$ with $0 \leq k \leq j$. In particular, $\lambda - \alpha$
is a weight of $V$.


Note that since $\lambda$ is integral, $j$ must be a (positive) integer.


Proof. Let $\mathfrak{s}^\alpha = \langle X_\alpha, Y_\alpha, H_\alpha \rangle$ be the copy of sl(2; C) corresponding to the root $\alpha$
(Theorem 7.19). Let $U$ be the subspace of $V$ spanned by weight spaces with weights
$\eta$ of the form $\eta = \lambda - k\alpha$ for $k \in \mathbb{Z}$. Since $X_\alpha$ and $Y_\alpha$ shift weights by $\pm\alpha$, the
space $U$ is invariant under $\mathfrak{s}^\alpha$. Note that since $\langle \alpha, H_\alpha \rangle = 2$, we have

\[
\langle \lambda - k\alpha, H_\alpha \rangle = j - 2k.
\]

That is, the weight space corresponding to weight $\lambda - k\alpha$ is precisely the eigenspace
for $\pi(H_\alpha)$ inside $U$ corresponding to the eigenvalue $j - 2k$.

By Point 4 of Theorem 4.34, if $j > 0$ is an eigenvalue for $\pi(H_\alpha)$ inside $U$, then
all of the integers $j - 2k$, $0 \leq k \leq j$, must also be eigenvalues for $\pi(H_\alpha)$ inside $U$.
Thus, $\lambda - k\alpha$ must be a weight of $\pi$ for $0 \leq k \leq j$. Since $j$ is a positive integer, $j$
must be at least 1, and, thus, $\lambda - \alpha$ must be a weight of $\pi$. □


Figure 10.2 illustrates the results of the "no holes" lemma. For the indicated
weight $\lambda$, the orthogonal projection of $\lambda$ onto $\alpha$ equals $(3/2)\alpha$, so that $j = 3$. Thus,
$\lambda - \alpha$, $\lambda - 2\alpha$, and $\lambda - 3\alpha$ must also be weights.
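The lemma can be tested on a small example. The sketch below assumes the adjoint representation of sl(3; C), whose weights are the six roots together with zero, and writes all weights in the basis of simple roots with the inner products $\langle \alpha_i, \alpha_i \rangle = 2$ and $\langle \alpha_1, \alpha_2 \rangle = -1$. These normalizations and the function names are illustrative choices.

```python
GRAM = [[2, -1], [-1, 2]]   # inner products of the simple roots of sl(3; C)

def inner(u, v):
    """Inner product of two vectors given in simple-root coordinates."""
    return sum(u[i] * GRAM[i][j] * v[j] for i in range(2) for j in range(2))

ROOTS = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]
WEIGHTS = set(ROOTS) | {(0, 0)}     # weights of the adjoint representation

def no_holes(weights, roots):
    """Check the conclusion of the 'no holes' lemma for a set of weights."""
    for lam in weights:
        for alpha in roots:
            j, rem = divmod(2 * inner(lam, alpha), inner(alpha, alpha))
            if rem != 0:
                return False        # lam would not be integral
            if j > 0:
                for k in range(j + 1):
                    shifted = (lam[0] - k * alpha[0], lam[1] - k * alpha[1])
                    if shifted not in weights:
                        return False
    return True
```

The full weight set passes the check, while the set of roots alone does not, since removing the zero weight leaves a "hole" between each root $\alpha$ and $-\alpha$.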


Fig. 10.2 Since $\lambda$ is a weight, each of the circled elements must also be a weight


Proof of Proposition 10.2. Since $\mu - \lambda$ is an integer combination of roots, $\mu - \lambda$
is also an integer combination of the positive simple roots $\alpha_1, \ldots, \alpha_r$. Since, also,
$\lambda \preceq \mu$, we have

\[
\mu = \lambda + \sum_{j=1}^{r} k_j \alpha_j
\]

for some non-negative integers $k_1, \ldots, k_r$. Consider now the following set $P$ of
integral elements,

\[
P = \Bigl\{ \eta = \lambda + \sum_{j=1}^{r} l_j \alpha_j \Bigm| 0 \leq l_j \leq k_j \Bigr\}. \tag{10.1}
\]

The elements of $P$ form a discrete parallelepiped.

We do not claim that every element of $P$ is a weight of $\pi$, which is not, in
general, true (Figure 10.3). Rather, we will show that if $\eta \neq \lambda$ is a weight of $\pi$ in $P$,
then there is another weight of $V_\mu$ in $P$ that is "closer" to $\lambda$. Specifically, for each
element $\eta$ of $P$ as in (10.1), let $L(\eta) = \sum_{j=1}^{r} l_j$. Starting from $\mu$, we will construct a
sequence of weights of $V_\mu$ in $P$ with decreasing values of $L(\eta)$, until we reach one
with $L(\eta) = 0$, which means that $\eta = \lambda$.

Suppose then that $\eta$ is a weight of $\pi$ in $P$ with $L(\eta) > 0$. In that case, the second
term in the formula for $\eta$ is nonzero, and, thus,

\[
\Bigl\langle \sum_{j=1}^{r} l_j \alpha_j, \sum_{k=1}^{r} l_k \alpha_k \Bigr\rangle
= \sum_{k=1}^{r} l_k \Bigl\langle \sum_{j=1}^{r} l_j \alpha_j, \alpha_k \Bigr\rangle > 0.
\]

Since each $l_k$ is non-negative, there must be some $\alpha_k$ for which $l_k > 0$ and for which

\[
\Bigl\langle \sum_{j=1}^{r} l_j \alpha_j, \alpha_k \Bigr\rangle > 0.
\]

On the other hand, since $\lambda$ is dominant, $\langle \lambda, \alpha_k \rangle \geq 0$, and we conclude that
$\langle \eta, \alpha_k \rangle > 0$. Thus, by the "no holes" lemma, $\eta - \alpha_k$ must also be a weight of $\pi$.

Now, since $l_k$ is positive, $l_k - 1$ is non-negative, meaning that $\eta - \alpha_k$ is still
in $P$, where all the $l_j$'s are unchanged except that $l_k$ is replaced by $l_k - 1$. Thus,
$L(\eta - \alpha_k) = L(\eta) - 1$. We can then repeat the process starting with $\eta - \alpha_k$ and
obtain a sequence of weights of $\pi$ with successively smaller values of $L$, until we
reach $L = 0$, which corresponds to $\eta = \lambda$. □


Figure 10.3 illustrates the proof of Proposition 10.2. Starting at $\mu$, we look for
a sequence of weights of $\pi$ in $P$. Each weight in the sequence has positive inner
product either with $\alpha_1$ or with $\alpha_2$, allowing us to move in the direction of $-\alpha_1$ or
$-\alpha_2$ to another weight of $\pi$ in $P$, until we reach $\lambda$.

Fig. 10.3 The thick line indicates a path of weights in $P$ connecting $\mu$ to $\lambda$


Proof of Theorem 10.1. Let $X \subset V_\mu$ denote the span of all weight vectors whose
weights differ from $\mu$ by an integer combination of roots. Then $X$ is easily seen to be
invariant under the action of $\mathfrak{g}$ and $X$ contains $v_0$, so $X = V_\mu$. Thus, every weight
of $\pi$ must satisfy Point 2 of Theorem 10.1. Furthermore, if $\lambda$ is a weight of $\pi$, then
$w \cdot \lambda \preceq \mu$ for all $w \in W$. Thus, by Proposition 8.44, $\lambda \in \mathrm{Conv}(W \cdot \mu)$.

Conversely, suppose that $\lambda$ satisfies the two conditions of Theorem 10.1. We
can choose $w \in W$ so that $\lambda' := w \cdot \lambda$ is dominant. Clearly, $\lambda'$ still belongs to
$\mathrm{Conv}(W \cdot \mu)$, so that, by Proposition 8.44, $\lambda' \preceq \mu$. Furthermore, since $\lambda$ is integral,
$w \cdot \lambda - \lambda$ is an element of the root lattice. After all, the definition of integrality
implies that $s_\alpha \cdot \lambda - \lambda$ is an integer multiple of $\alpha$; since the $s_\alpha$'s generate $W$, the
result holds for all $w \in W$. Thus,
$\mu - \lambda' = (\mu - \lambda) + (\lambda - \lambda')$ is an element of the root lattice, which means that $\lambda'$
also satisfies the two conditions of Theorem 10.1. Thus, by Proposition 10.2, $\lambda'$ is a
weight of $\pi$, which means that $\lambda = w^{-1} \cdot \lambda'$ is also a weight. □
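The converse argument suggests a simple procedure for listing weights: find the dominant elements $\lambda \preceq \mu$ with $\mu - \lambda$ in the root lattice, then take Weyl-group orbits. The sketch below carries this out for sl(3; C), under the assumption that weights are written in coordinates $(a, b)$ with respect to the fundamental weights, so that $\langle \lambda, H_{\alpha_i} \rangle$ is the $i$-th coordinate and the simple roots are $(2, -1)$ and $(-1, 2)$. The function names are illustrative.

```python
# alpha_1 and alpha_2 in fundamental-weight coordinates (rows of the
# Cartan matrix of sl(3; C)).
SIMPLE_ROOTS = [(2, -1), (-1, 2)]

def reflect(lam, i):
    """Simple reflection: lam -> lam - <lam, H_i> alpha_i."""
    a = SIMPLE_ROOTS[i]
    return (lam[0] - lam[i] * a[0], lam[1] - lam[i] * a[1])

def weyl_orbit(lam):
    """Orbit of lam under the group generated by the simple reflections."""
    orbit, frontier = {lam}, [lam]
    while frontier:
        w = frontier.pop()
        for i in range(2):
            r = reflect(w, i)
            if r not in orbit:
                orbit.add(r)
                frontier.append(r)
    return orbit

def weights(mu):
    """Weights of the irreducible representation with highest weight mu."""
    bound = mu[0] + mu[1]            # generous bound on the coefficients k_j
    result = set()
    for k1 in range(bound + 1):
        for k2 in range(bound + 1):
            # lam = mu - k1*alpha_1 - k2*alpha_2
            lam = (mu[0] - 2 * k1 + k2, mu[1] + k1 - 2 * k2)
            if lam[0] >= 0 and lam[1] >= 0:      # lam is dominant
                result |= weyl_orbit(lam)
    return result
```

For example, `weights((1, 0))` gives the three weights of the standard representation, and `weights((1, 1))` gives the six roots together with zero. The procedure records only which weights occur, not their multiplicities.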


10.2 The Casimir Element 


In this section, we construct an element of U(g) known as the Casimir, which 
belongs to the center of U(g). The Casimir element is important in its own right 
and also plays a crucial role in the proof of complete reducibility (Sect. 10.3) and of 
the Weyl character formula (Sect. 10.8). 


Definition 10.4. Let $\{X_j\}$ be an orthonormal basis for $\mathfrak{k}$. Then the Casimir element
$C$ of $U(\mathfrak{g})$ is given by

\[
C = -\sum_j X_j^2.
\]

Proposition 10.5. 1. The value of $C$ is independent of the choice of orthonormal
basis for $\mathfrak{k}$.
2. The element $C$ is in the center of $U(\mathfrak{g})$.


Proof. If $\{X_j\}$ and $\{Y_j\}$ are two different orthonormal bases for $\mathfrak{k}$, then there is an
orthogonal matrix $R$ such that

\[
Y_j = \sum_k R_{jk} X_k.
\]

Then

\[
\begin{aligned}
\sum_j Y_j^2 &= \sum_{j,k,l} R_{jk} X_k R_{jl} X_l \\
&= \sum_{k,l} \Bigl( \sum_j (R^{\mathrm{tr}})_{kj} R_{jl} \Bigr) X_k X_l \\
&= \sum_{k,l} \delta_{kl} X_k X_l \\
&= \sum_k X_k^2.
\end{aligned}
\]

This shows that $C$ is independent of the choice of basis.
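The computation above uses only the orthogonality of $R$ and never assumes that the $X_k$ commute. As a check, the following sketch applies an orthogonal matrix with rational entries to two noncommuting matrices and compares the sums of squares exactly. The sample matrices and helper names are illustrative and are not meant to form an orthonormal basis of any particular Lie algebra.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def sum_of_squares(mats):
    return combo([1] * len(mats), [mul(m, m) for m in mats])

# An orthogonal matrix with rational entries (a rotation built from 3-4-5).
R = [[Fraction(3, 5), Fraction(4, 5)],
     [Fraction(-4, 5), Fraction(3, 5)]]

# Two noncommuting sample matrices standing in for X_1 and X_2.
X = [[[1, 2], [3, 4]], [[0, 1], [-1, 5]]]

# Y_j = sum_k R_jk X_k
Y = [combo(R[j], X) for j in range(2)]
```

Comparing `sum_of_squares(Y)` with `sum_of_squares(X)` gives equality, in agreement with the displayed calculation.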
Meanwhile, if $\{X_j\}$ is an orthonormal basis, let $c_{jkl}$ be the associated structure
constants:

\[
[X_j, X_k] = \sum_l c_{jkl} X_l.
\]

Note that for a fixed $j$, the matrix $A^j$ given by $(A^j)_{lk} = c_{jkl}$ is the matrix
representing the operator $\mathrm{ad}_{X_j}$ in the chosen basis. Since the inner product on $\mathfrak{k}$
is Ad-$K$-invariant, $\mathrm{ad}_{X_j}$ is a skew operator, which means that $c_{jkl}$ is skew symmetric
in $k$ and $l$ for a fixed $j$. If we compute the commutator of some $X_j$ with $C$ in $U(\mathfrak{g})$,
we obtain

\[
\begin{aligned}
[X_j, C] &= -\sum_k [X_j, X_k^2] \\
&= -\sum_k \bigl( [X_j, X_k] X_k + X_k [X_j, X_k] \bigr) \\
&= -\Bigl( \sum_{k,l} c_{jkl} X_l X_k + \sum_{k,l} c_{jkl} X_k X_l \Bigr).
\end{aligned} \tag{10.2}
\]

In the first sum in the last line of (10.2), we may reverse the labeling of the
summation variables and use the skew symmetry of $c_{jkl}$ in $k$ and $l$ to obtain

\[
[X_j, C] = -\sum_{k,l} (c_{jlk} + c_{jkl}) X_k X_l = 0.
\]

Thus, $C$ commutes with each $X_j$. But since $U(\mathfrak{g})$ is generated by elements of $\mathfrak{g}$, we
see that $C$ actually commutes with every element of $U(\mathfrak{g})$. □


Let $\pi$ be a finite-dimensional, irreducible representation of $\mathfrak{g}$. By Proposition 9.8,
we can extend $\pi$ to a representation of $U(\mathfrak{g})$, which we also denote by $\pi$. We
now show $\pi(C)$ is a constant multiple of the identity. The formula for the constant
involves the element $\delta$ (Definition 8.37), equal to half the sum of the positive roots.
The element $\delta$ also arises in our discussion of the Weyl character formula, the Weyl
dimension formula, and the Kostant multiplicity formula.


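Proposition 10.6. Let $(\pi, V)$ be a finite-dimensional irreducible representation of
$\mathfrak{g}$ (extended to $U(\mathfrak{g})$) with highest weight $\mu$. Then we have

\[ \pi(C) = -\sum_j \pi(X_j)^2 = c_\mu I, \]

where $c_\mu$ is a constant given by

\[ c_\mu = \langle \mu + \delta, \mu + \delta \rangle - \langle \delta, \delta \rangle. \]

Furthermore, $c_\mu \geq 0$ with $c_\mu = 0$ only if $\mu = 0$.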


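Lemma 10.7. Let $X \in \mathfrak{g}_\alpha$ be a unit vector, so that $X^* \in \mathfrak{g}_{-\alpha}$. Then under our usual
identification of $\mathfrak{h}$ with $\mathfrak{h}^*$, we have

\[ [X, X^*] = \alpha. \]

Proof. According to Lemma 7.22, we have

\[ \langle [X, X^*], H_\alpha \rangle = \langle \alpha, H_\alpha \rangle \langle X^*, X^* \rangle = \langle \alpha, H_\alpha \rangle, \tag{10.3} \]

since $X$ (and thus, also, $X^*$) is a unit vector. On the other hand, we know that the
commutator of any element of $\mathfrak{g}_\alpha$ with any element of $\mathfrak{g}_{-\alpha}$ is a multiple of $H_\alpha$, which
is (under our identification of $\mathfrak{h}$ with $\mathfrak{h}^*$) a multiple of $\alpha$. But if $[X, X^*] = c\alpha$, (10.3)
tells us that $c$ must equal 1. □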


Proof of Proposition 10.6. Since $C$ is in the center of the universal enveloping
algebra, $\pi(C)$ commutes with each $\pi(X)$, $X \in \mathfrak{g}$. Thus, by Schur's lemma, $\pi(C)$
must act as a constant multiple $c_\mu$ of the identity operator.

To compute the constant $c_\mu$, we choose an orthonormal basis for $\mathfrak{k}$ as follows.
Take an orthonormal basis $H_1, \ldots, H_r$ for $\mathfrak{t}$. Then for each $\alpha \in R^+$, choose a unit
vector $X_\alpha$ in $\mathfrak{g}_\alpha$, so that $X_\alpha^*$ is a unit vector in $\mathfrak{g}_{-\alpha}$. Then the elements

\[ Y_\alpha = (X_\alpha + X_\alpha^*)/(\sqrt{2}\, i); \qquad Z_\alpha = (X_\alpha - X_\alpha^*)/\sqrt{2} \]

satisfy $Y_\alpha^* = -Y_\alpha$ and $Z_\alpha^* = -Z_\alpha$, which shows that these elements belong to $\mathfrak{k}$.
Since $\mathfrak{g}_\alpha$ is orthogonal to $\mathfrak{g}_{-\alpha}$, it is easy to see that these vectors are also unit vectors
and orthogonal to each other. The set of vectors of the form $H_j$, $j = 1, \ldots, r$, and
$Y_\alpha$, $\alpha \in R^+$, and $Z_\alpha$, $\alpha \in R^+$, form an orthonormal basis for $\mathfrak{k}$.

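We compute that

\begin{align*}
Y_\alpha^2 &= -\tfrac{1}{2} \big( X_\alpha^2 + X_\alpha X_\alpha^* + X_\alpha^* X_\alpha + (X_\alpha^*)^2 \big) \\
Z_\alpha^2 &= \tfrac{1}{2} \big( X_\alpha^2 - X_\alpha X_\alpha^* - X_\alpha^* X_\alpha + (X_\alpha^*)^2 \big),
\end{align*}

so that

\begin{align*}
-Y_\alpha^2 - Z_\alpha^2 &= X_\alpha X_\alpha^* + X_\alpha^* X_\alpha \\
&= 2 X_\alpha^* X_\alpha + [X_\alpha, X_\alpha^*].
\end{align*}

Thus, the Casimir element $C$ may be computed as

\[ C = \sum_{\alpha \in R^+} \big( 2 X_\alpha^* X_\alpha + [X_\alpha, X_\alpha^*] \big) - \sum_{j=1}^r H_j^2. \]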


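Suppose now that $v$ is a highest weight vector, and compute that

\[ \pi(C)v = -\sum_{j=1}^r \pi(H_j)^2 v + 2 \sum_{\alpha \in R^+} \pi(X_\alpha^*) \pi(X_\alpha) v + \sum_{\alpha \in R^+} \pi([X_\alpha, X_\alpha^*]) v. \]

Since $v$ is a highest weight vector, $\pi(X_\alpha)v = 0$, and since $H_j \in \mathfrak{t} \subset \mathfrak{h}$, we have
$\pi(H_j)v = \langle \mu, H_j \rangle v$. Using Lemma 10.7, we then see that

\[ \pi(C)v = \Big[ -\sum_{j=1}^r \langle \mu, H_j \rangle^2 + \sum_{\alpha \in R^+} \langle \mu, \alpha \rangle \Big] v, \tag{10.4} \]

where the coefficient of $v$ on the right-hand side of (10.4) must equal $c_\mu$.

Now, $\mu$ belongs to $i\mathfrak{t}$, so each $\langle \mu, H_j \rangle$ is purely imaginary and
$-\langle \mu, H_j \rangle^2 = \langle \mu, iH_j \rangle^2$. Since $\{H_j\}_{j=1}^r$ is an orthonormal basis for $\mathfrak{t}$ (so that
$\{iH_j\}$ is an orthonormal basis for $i\mathfrak{t}$), the first term in the coefficient
of $v$ equals $\langle \mu, \mu \rangle$. Moving the sum over $\alpha$ inside the inner product in the second
term gives

\[ c_\mu = \langle \mu, \mu \rangle + \langle \mu, 2\delta \rangle, \tag{10.5} \]

which is the same as $\langle \mu + \delta, \mu + \delta \rangle - \langle \delta, \delta \rangle$. Finally, we note that since $\mu$ is
dominant, $\langle \mu, \alpha \rangle \geq 0$ for every positive root, from which it follows that
$\langle \mu, 2\delta \rangle = \sum_{\alpha \in R^+} \langle \mu, \alpha \rangle$ is non-negative. Thus, $c_\mu \geq 0$ for all $\mu$ and $c_\mu > 0$ if $\mu \neq 0$. □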
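As a numerical illustration of Propositions 10.5 and 10.6, the following sketch computes the Casimir element for $\mathfrak{k} = \mathrm{su}(2)$ in the defining and adjoint representations. It rests on assumptions not fixed by the text: the inner product is taken to be $\langle X, Y \rangle = \mathrm{trace}(X^* Y)$, the orthonormal basis is $X_j = (i/\sqrt{2})\sigma_j$ with $\sigma_j$ the Pauli matrices, and the helper names (`matmul`, `bracket`, `inner`, `casimir`, `c_mu`) are hypothetical. With this normalization, the constant of Proposition 10.6 for highest weight $m$ works out to $m(m+2)/2$.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def inner(A, B):
    """Assumed inner product <A, B> = trace(A^* B); it is real on su(2)."""
    n = len(A)
    return sum(complex(A[k][i]).conjugate() * B[k][i]
               for i in range(n) for k in range(n)).real

def casimir(mats):
    """Return -sum_j M_j^2 for a list of representing matrices M_j."""
    n = len(mats[0])
    squares = [matmul(M, M) for M in mats]
    return [[-sum(S[i][j] for S in squares) for j in range(n)]
            for i in range(n)]

r = 1 / math.sqrt(2)
# Orthonormal basis X_j = (i / sqrt(2)) * sigma_j of su(2).
X = [
    [[0, 1j * r], [1j * r, 0]],
    [[0, r], [-r, 0]],
    [[1j * r, 0], [0, -1j * r]],
]

# Structure constants: [X_j, X_k] = sum_l c[j][k][l] X_l.
c = [[[inner(X[l], bracket(X[j], X[k])) for l in range(3)]
      for k in range(3)] for j in range(3)]

# Matrix of ad_{X_j}: column k holds the coordinates of [X_j, X_k].
ad = [[[c[j][k][l] for k in range(3)] for l in range(3)] for j in range(3)]

casimir_defining = casimir(X)   # highest weight m = 1
casimir_adjoint = casimir(ad)   # highest weight m = 2

def c_mu(m):
    """<mu+delta, mu+delta> - <delta, delta> for sl(2; C), assumed normalization."""
    return m * (m + 2) / 2
```

In both representations the result is a scalar matrix, and the scalars agree with `c_mu(1)` and `c_mu(2)`; the array `c` is also skew symmetric in its last two indices, as used in the proof of Proposition 10.5.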


10.3 Complete Reducibility 


Let $\mathfrak{g}$ be a complex semisimple Lie algebra. Then $\mathfrak{g}$ is isomorphic to the
complexification of the Lie algebra of a compact matrix Lie group $K$. (This is, for us, true
by definition; see Definition 7.1.) Actually, it is possible to show that there exists
a simply connected compact Lie group $K$ with Lie algebra $\mathfrak{k}$ such that $\mathfrak{g} \cong \mathfrak{k}_{\mathbb{C}}$.
(This claim follows from Theorems 4.11.6 and 4.11.10 of [Var].) Assuming this
result, we can see that every finite-dimensional representation $\pi$ of $\mathfrak{g}$ gives rise
to a representation of $K$ by restricting $\pi$ to $\mathfrak{k}$ and then applying Theorem 5.6.
Theorem 4.28 then tells us that every finite-dimensional representation of $\mathfrak{g}$ is
completely reducible; compare Corollary 4.11.11 in [Var].

Rather than relying on the existence of a simply connected K, we now give 
an algebraic proof of complete reducibility. Our proof makes use of the Casimir 
element and, in particular, the fact that the eigenvalue of the Casimir is nonzero in 
each nontrivial irreducible representation. 


Proposition 10.8. If $(\pi, V)$ is a one-dimensional representation of $\mathfrak{g}$, then
$\pi(X) = 0$ for all $X \in \mathfrak{g}$.


Proof. By Theorem 7.8, $\mathfrak{g}$ decomposes as a Lie algebra direct sum of simple
algebras $\mathfrak{g}_j$. Since the kernel of $\pi|_{\mathfrak{g}_j}$ is an ideal, the restriction of $\pi$ to $\mathfrak{g}_j$ must be
either zero or injective. But since $\dim \mathfrak{g}_j \geq 2$ and $\dim(\mathrm{End}(V)) = 1$, this restriction
cannot be injective, so it must be zero for each $j$. □


Theorem 10.9. Every finite-dimensional representation of a semisimple Lie alge- 
bra is completely reducible. 


We begin by considering what appears to be a very special case of the theorem. 


Lemma 10.10. Suppose $(\pi, V)$ is a finite-dimensional representation of $\mathfrak{g}$ and that
$W$ is an invariant subspace of $V$ of codimension 1. Then $V$ decomposes as $W \oplus U$
for some invariant subspace $U$ of $V$.


Proof. We consider first the case in which $W$ is irreducible. If $W$ is
one-dimensional, then by Proposition 10.8, the restriction of $\pi$ to $W$ is zero. Since $W$
is one-dimensional and has codimension 1, $V$ must be two dimensional. The space
of linear operators on $V$ that are zero on $W$ then has dimension 2. (Pick a basis for
$V$ consisting of a nonzero element of $W$ and another linearly independent vector;
in this basis, any such operator will have first column equal to zero.) On the other
hand, each of the simple summands $\mathfrak{g}_j$ in the decomposition of $\mathfrak{g}$ has dimension at
least 3, since $\mathfrak{g}_j$ contains a subalgebra isomorphic to $\mathrm{sl}(2; \mathbb{C})$. Thus, the restriction
of $\pi$ to $\mathfrak{g}_j$ cannot be injective and thus must be zero. We conclude, then, that $\pi$ is
identically zero, and we may take $U$ to be any subspace of $V$ complementary to $W$.

Assume now that $W$ is irreducible and nontrivial. Let $C$ be the Casimir element
of $U(\mathfrak{g})$ and let $\pi(C)$ denote the action of $C$ on $V$, by means of the extension of $\pi$
to $U(\mathfrak{g})$. By Proposition 10.6, the restriction of $\pi(C)$ to $W$ is a nonzero multiple $c$
of the identity. On the other hand, since $V/W$ is one dimensional, the action of $\mathfrak{g}$
on $V/W$ is trivial, by Proposition 10.8. Thus, the action of $\pi(C)$ on $V/W$ is zero,
from which it follows that $\pi(C)$ must have a nonzero kernel. (If we pick a basis for
$V$ consisting of a basis for $W$ together with one other vector, the bottom row of the
matrix of $\pi(C)$ in this basis will be identically zero.) Because $\pi(C)$ commutes with each
$\pi(X)$, this kernel is an invariant subspace of $V$. Furthermore, $\ker(\pi(C)) \cap W = \{0\}$,
because $\pi(C)$ acts as a nonzero scalar on $W$. Thus, $U := \ker(\pi(C))$ is the desired
invariant complement to $W$.

We consider next the case in which $W$ has a nontrivial invariant subspace $W'$,
which is, of course, also an invariant subspace of $V$. Then $W/W'$ is a
codimension-one invariant subspace of $V/W'$. Thus, by induction on the dimension of $W$, we
may assume that $W/W'$ has a one-dimensional invariant complement, say $Y/W'$.
Then $W'$ is a codimension-one invariant subspace of $Y$. Since $\dim W' < \dim W$,
we may apply induction again to find a one-dimensional invariant complement $U$ to
$W'$ in $Y$, so that $Y = W' \oplus U$. Now, $Y \cap W = W'$ and $U \cap W' = \{0\}$, from which
it follows that $U \cap W = \{0\}$. Thus, $U$ is the desired complement to $W$ in $V$. □


Proof of Theorem 10.9. Let $(\pi, V)$ be a finite-dimensional representation of $\mathfrak{g}$
and let $W$ be a nontrivial invariant subspace of $V$. We now look for an invariant
complement to $W$, that is, an invariant subspace $U$ such that $V = W \oplus U$. If we
can always find such a $U$, then we may proceed by induction on the dimension to
establish complete reducibility. If $A : V \to W$ is an intertwining map, the kernel
of $A$ will be an invariant subspace of $V$. If, in addition, the restriction of $A$ to $W$ is
injective, then $\ker(A) \cap W = \{0\}$ and, by a dimension count, $V$ will decompose as
$W \oplus \ker(A)$. Now, the simplest way to ensure that $A$ is injective on $W$ is to assume
that $A|_W$ is a nonzero multiple of the identity. (If, for example, the restriction of
$A$ to $W$ is the identity, we may think of $A$ as a "projection" of $V$ onto $W$.) If $W$
is irreducible, nontrivial, and of codimension 1, we may take $A = \pi(C)$ as in the
proof of Lemma 10.10.

To construct $A$ in general, we proceed as follows. Let $\mathrm{Hom}(V, W)$ denote the
space of linear maps of $V$ to $W$ (not necessarily intertwining maps). This space can
be viewed as a representation of $\mathfrak{g}$ by means of the action

\[ X \cdot A = \pi(X) A - A \pi(X) \tag{10.6} \]

for all $X \in \mathfrak{g}$ and $A \in \mathrm{Hom}(V, W)$. Let $\mathcal{V}$ denote the subspace of $\mathrm{Hom}(V, W)$
consisting of those maps $A$ whose restriction to $W$ is a scalar multiple $c_A$ of the
identity and let $\mathcal{W}$ denote the subspace of $\mathrm{Hom}(V, W)$ consisting of those maps
whose restriction to $W$ is zero. The map $A \mapsto c_A$ is a linear functional on $\mathcal{V}$ which
is easily seen not to be identically zero. The space $\mathcal{W}$ is the kernel of this linear
functional and is, thus, a codimension-one subspace of $\mathcal{V}$.

We now claim that both $\mathcal{V}$ and $\mathcal{W}$ are invariant subspaces of $\mathrm{Hom}(V, W)$. To see
this, suppose $A \in \mathrm{Hom}(V, W)$ is equal to $cI$ on $W$. Then for $w \in W$, we have

\begin{align*}
(X \cdot A) w &= \pi(X) A w - A \pi(X) w \\
&= \pi(X) c w - c \pi(X) w \\
&= 0,
\end{align*}

because $\pi(X)w$ is again in $W$, showing that $X \cdot A$ is actually in $\mathcal{W}$. Thus, the action
of $\mathfrak{g}$ maps $\mathcal{V}$ into $\mathcal{W}$, showing that both $\mathcal{V}$ and $\mathcal{W}$ are invariant.

Since $\mathcal{W}$ has codimension 1 in $\mathcal{V}$, we are in the situation of Lemma 10.10. Thus,
there exists an invariant complement $\mathcal{U}$ to $\mathcal{W}$ in $\mathcal{V}$. Let $A$ be a nonzero element of
$\mathcal{U}$. Since $A$ is not in $\mathcal{W}$, the restriction of $A$ to $W$ is a nonzero scalar. Furthermore,
since $\mathcal{U}$ is one dimensional, Proposition 10.8 tells us that the action of $\mathfrak{g}$ on $A$ is zero,
meaning that $A$ commutes with each $\pi(X)$. That is to say, $A$ is an intertwining map
of $V$ to $W$. Thus, $A$ is precisely the operator we were looking for: an intertwining
map of $V$ to $W$ whose restriction to $W$ is a nonzero multiple of the identity. Now,
$A$ maps $V$ into $W$, and actually maps onto $W$, since $A$ acts as a nonzero scalar on
$W$ itself. Thus, $\dim(\ker(A)) = \dim V - \dim W$ and since $\ker(A) \cap W = \{0\}$, we
conclude that $U := \ker(A)$ is an invariant complement to $W$.

We conclude, then, that every nontrivial invariant subspace W of V has an 
invariant complement U. By induction on the dimension, we may then assume 
that both W and U decompose as direct sums of irreducible invariant subspaces, 
in which case, V also has such a decomposition. □


10.4 The Weyl Character Formula 


The character formula is a major result in the structure of the irreducible represen- 
tations of g. Its consequences include a formula for the dimension of an irreducible 
representation (Sect. 10.5) and a formula for the multiplicities of the weights in an 
irreducible representation (Sect. 10.6). 

We now introduce the notion of the character of a finite-dimensional representa- 
tion of a group. 


Definition 10.11. Suppose $(\pi, V)$ is a finite-dimensional representation of a
complex semisimple Lie algebra $\mathfrak{g}$. The character of $\pi$ is the function $\chi_\pi : \mathfrak{g} \to \mathbb{C}$
given by

\[ \chi_\pi(X) = \mathrm{trace}(e^{\pi(X)}). \]


It turns out that the character of $\pi$ encodes many interesting properties of $\pi$. We
will give a formula for the character of an irreducible representation of a semisimple
Lie algebra in terms of the highest weight of the representation. This section gives
the statement of the character formula, Sects. 10.5 and 10.6 give consequences of it,
and Sect. 10.8 gives the proof.

If the representation $\pi$ of $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ comes from a representation $\Pi$ of the compact
group $K$, then for $X \in \mathfrak{k}$, we have $\chi_\pi(X) = \mathrm{trace}(\Pi(e^X))$. In Chapter 12, the group
version of a character, namely, the function given by $x \mapsto \mathrm{trace}(\Pi(x))$, $x \in K$,
plays a key role in the compact group approach to representation theory.


Proposition 10.12. If $(\pi, V)$ is a finite-dimensional representation of $\mathfrak{g}$, we have
the following results.

1. The dimension of $V$ is equal to the value of $\chi_\pi$ at the origin:

\[ \dim(V) = \chi_\pi(0). \]

2. Suppose $V$ decomposes as a direct sum of weight spaces $V_\lambda$ with multiplicity
$\mathrm{mult}(\lambda)$. Then for $H \in \mathfrak{h}$ we have

\[ \chi_\pi(H) = \sum_\lambda \mathrm{mult}(\lambda) e^{\langle \lambda, H \rangle}. \tag{10.7} \]


Proof. The first point holds because $\mathrm{trace}(I) = \dim V$. Meanwhile, since $\pi(H)$
acts as $\langle \lambda, H \rangle I$ in each weight space $V_\lambda$, the second point follows from the
definition of $\chi_\pi$. □


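Example 10.13. Let $\pi_m$ denote the irreducible representation of $\mathrm{sl}(2; \mathbb{C})$ of
dimension $m + 1$ and let

\[ H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

Then

\begin{align*}
\chi_{\pi_m}(aH) &= \chi_{\pi_m}\left( \begin{pmatrix} a & 0 \\ 0 & -a \end{pmatrix} \right) \\
&= e^{ma} + e^{(m-2)a} + \cdots + e^{-ma}. \tag{10.8}
\end{align*}

We may also compute $\chi_{\pi_m}$ as

\[ \chi_{\pi_m}(aH) = \frac{\sinh((m+1)a)}{\sinh(a)} \tag{10.9} \]

whenever $a$ is not an integer multiple of $i\pi$.

Proof. The eigenvalues of $\pi_m(H)$ are $m, m-2, \ldots, -m$, from which (10.8) follows.
To obtain (10.9), we note that

\begin{align*}
(e^a - e^{-a}) \chi_{\pi_m}(aH) &= e^{(m+1)a} + e^{(m-1)a} + \cdots + e^{-(m-1)a} \\
&\quad - e^{(m-1)a} - \cdots - e^{-(m-1)a} - e^{-(m+1)a} \\
&= e^{(m+1)a} - e^{-(m+1)a}, \tag{10.10}
\end{align*}

so that

\[ \chi_{\pi_m}(aH) = \frac{e^{(m+1)a} - e^{-(m+1)a}}{e^a - e^{-a}} = \frac{\sinh((m+1)a)}{\sinh(a)}, \]

as claimed. □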
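The agreement between the weight sum (10.8) and the closed form (10.9) can be checked numerically. The following is a minimal sketch; the function names `character_sum` and `character_closed` are hypothetical, and the test value of $a$ is an arbitrary complex number that is not an integer multiple of $i\pi$.

```python
import cmath

def character_sum(m, a):
    """Sum of e^{(m - 2k) a} over the weights m, m-2, ..., -m, as in (10.8)."""
    return sum(cmath.exp((m - 2 * k) * a) for k in range(m + 1))

def character_closed(m, a):
    """Closed form sinh((m+1)a) / sinh(a) from (10.9); requires sinh(a) != 0."""
    return cmath.sinh((m + 1) * a) / cmath.sinh(a)

a = 0.3 + 0.7j  # arbitrary sample point, not an integer multiple of i*pi
differences = [abs(character_sum(m, a) - character_closed(m, a))
               for m in range(6)]
dimensions = [character_sum(m, 0).real for m in range(6)]
```

Each entry of `differences` is zero up to rounding error, and `dimensions` recovers $\dim V_m = m + 1$ as the value of the character at the origin (Proposition 10.12).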


Note that in deriving (10.9) from (10.8), we multiplied the character by a cleverly
chosen combination of exponentials ($e^a - e^{-a}$), leading to a large cancellation, so
that only two terms remain in (10.10). The Weyl character formula asserts that
we can perform a similar trick for the characters of irreducible representations of
arbitrary semisimple Lie algebras.

Recall that each element $w$ of the Weyl group $W$ acts as an orthogonal
linear transformation of $i\mathfrak{t} \subset \mathfrak{h}$. We let $\det(w)$ denote the determinant of this
transformation, so that $\det(w) = \pm 1$. Recall from Definition 8.37 that $\delta$ denotes
half the sum of the positive roots. We are now ready to state the main result of this
chapter.


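Theorem 10.14 (Weyl Character Formula). If $(\pi, V_\mu)$ is an irreducible
representation of $\mathfrak{g}$ with highest weight $\mu$, then

\[ \chi_\pi(H) = \frac{\sum_{w \in W} \det(w) e^{\langle w \cdot (\mu + \delta), H \rangle}}{\sum_{w \in W} \det(w) e^{\langle w \cdot \delta, H \rangle}} \tag{10.11} \]

for all $H \in \mathfrak{h}$ for which the denominator is nonzero.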


Since we will have frequent occasion to refer to the function in the denominator
in (10.11), we give it a name.


Definition 10.15. Let $q : \mathfrak{h} \to \mathbb{C}$ be the function given by

\[ q(H) = \sum_{w \in W} \det(w) e^{\langle w \cdot \delta, H \rangle}. \]

The function $q$ is called the Weyl denominator.

The character formula may also be written as

\[ q(H) \chi_\pi(H) = \sum_{w \in W} \det(w) e^{\langle w \cdot (\mu + \delta), H \rangle}. \tag{10.12} \]
wew 


Let us pause for a moment to reflect on what is going on in (10.12). The Weyl
denominator $q(H)$ is a sum of $|W|$ exponentials with coefficients equal to $\pm 1$.
Meanwhile, the character $\chi_\pi(H)$ is a large sum of exponentials with positive integer
coefficients, as in (10.7). When we multiply these two functions, we seemingly
obtain an even larger sum of exponentials of the form

\[ e^{\langle w \cdot \delta + \lambda, H \rangle} \]

for $w \in W$ and $\lambda$ a weight of $\pi$, with integer coefficients.

The character formula, however, asserts that most of these terms are not actually
present. Specifically, the only exponentials that actually appear are those of the form
$e^{\langle w \cdot (\mu + \delta), H \rangle}$, which occurs with a coefficient of $\det(w)$. The point is that, in most
cases, if a weight $\eta$ can be written in the form $\eta = w \cdot \delta + \lambda$, with $\lambda$ a weight of
$\pi$, then $\eta$ can be written in this form in more than one way. The character formula
asserts that unless $\eta$ is in the Weyl-group orbit of $\delta + \mu$, the coefficient of $e^{\langle \eta, H \rangle}$,
after all the different contributions are taken into account, ends up being zero.

By contrast, the weight $\eta = \delta + \mu$ only occurs once, since it corresponds to taking
the highest weight occurring in $q$ (namely $\delta$) and the highest weight occurring in $\chi_\pi$
(namely the highest weight $\mu$ of $\pi$). (Note that by Propositions 8.38 and 8.27, the
elements of the form $w \cdot \delta$ are all distinct. Then by Proposition 8.42, $w \cdot \delta \preceq \delta$ for all
$w$.) Furthermore, since the weights occurring in both $q$ and $\chi_\pi$ are Weyl invariant,
the weight $w \cdot (\delta + \mu)$ also occurs only once. The Weyl character formula, then,
can be expressed as stating that if we compute the product $q\chi_\pi$, a huge cancellation
occurs: Every exponential in the product ends up canceling out to zero, except for those
that occur only once, namely those of the form $w \cdot (\delta + \mu)$.

In the case of $\mathrm{sl}(2; \mathbb{C})$, we have already observed how this cancellation occurs,
in (10.10). The Weyl denominator is equal to $e^a - e^{-a}$ in this case. Each exponential
$e^{la}$ occurring in the product $(e^a - e^{-a}) \chi_\pi(aH)$ will occur once with a plus sign (from
$e^a e^{(l-1)a}$) and once with a minus sign (from $e^{-a} e^{(l+1)a}$), except for the extreme
cases $l = \pm(m + 1)$. This cancellation occurs because the multiplicity of the weight
$l - 1$ equals the multiplicity of the weight $l + 1$, namely 1.

Figure 10.4, meanwhile, illustrates the case of the irreducible representation of 
sl(3;C) with highest weight (1,2). The top part of the figure indicates the six 
exponentials occurring in the Weyl denominator, with alternating signs. The middle 
part of the figure indicates the exponentials in the character of the representation 
with highest weight (1,2). The bottom part of the figure shows the product of the 
Weyl denominator and the character, in which only the six exponentials indicated 
by black dots survive. The white dots in the bottom part of the figure indicate 
exponentials that occur at least once in the product, but which end up with a 
coefficient of zero. 

The cancellation inherent in the Weyl character formula reflects a very special
structure to the multiplicities of the various weights that occur. For an integral
element $\lambda$, each product of the form $e^{\langle w \cdot \delta, H \rangle} e^{\langle \eta, H \rangle}$, where $\eta = \lambda - w \cdot \delta$, makes
a contribution of $\det(w)\,\mathrm{mult}(\eta)$ to the coefficient of $e^{\langle \lambda, H \rangle}$. Thus, if $\lambda$ is not in the
Weyl orbit of $\mu + \delta$, the Weyl character formula implies that

\[ \sum_{w \in W} \det(w)\, \mathrm{mult}(\lambda - w \cdot \delta) = 0, \qquad \lambda \notin W \cdot (\mu + \delta). \tag{10.13} \]

(In (10.13), some of the elements of the form $\lambda - w \cdot \delta$ may not actually be weights of
$\pi$, in which case, the multiplicity should be considered to be zero.) In the case of
$\mathrm{sl}(3; \mathbb{C})$, the weights of the form $\lambda - w \cdot \delta$ form a small hexagon around the weight
$\lambda$. Figure 10.5 illustrates how the alternating sum of multiplicities around one such
hexagon equals zero.


[Fig. 10.4 The Weyl denominator (top), the character (middle), and their product (bottom)
for the representation of $\mathrm{sl}(3; \mathbb{C})$ with highest weight (1,2)]

[Fig. 10.5 We compute the alternating sum of multiplicities around the hexagon enclosing the
circled weight, beginning in the fundamental Weyl chamber and proceeding counterclockwise.
The result is 1 - 1 + 1 - 2 + 2 - 1 = 0]

We will see in Sect. 10.6 that the character formula leads to a formula for the
multiplicities of all the weights occurring in a particular irreducible representation.

Before concluding this section, we establish a technical result that we will use in
the remainder of the chapter.


Proposition 10.16. The exponential functions $H \mapsto e^{\langle \lambda, H \rangle}$, with $\lambda \in \mathfrak{h}$, are
linearly independent in $C^\infty(\mathfrak{h})$.


The proposition means that if a function $f \in C^\infty(\mathfrak{h})$ can be expressed as a finite
linear combination of exponentials, it has a unique such expression.


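Proof. We need to show that if the function $f : \mathfrak{h} \to \mathbb{C}$ given by

\[ f(H) = c_1 e^{\langle \lambda_1, H \rangle} + \cdots + c_n e^{\langle \lambda_n, H \rangle} \]

is identically zero, where $\lambda_1, \ldots, \lambda_n$ are distinct elements of $\mathfrak{h}$, then
$c_1 = \cdots = c_n = 0$. If $n = 1$, we evaluate at $H = 0$ and conclude that $c_1$ must be zero. If $n > 1$,
we choose, for each $k = 2, \ldots, n$, some $H_k \in \mathfrak{h}$ such that $\langle \lambda_1, H_k \rangle \neq \langle \lambda_k, H_k \rangle$.
Since $f$ is identically zero, so is the function

\[ g := (D_{H_2} - \langle \lambda_2, H_2 \rangle) \cdots (D_{H_n} - \langle \lambda_n, H_n \rangle) f, \]

where $D_X$ denotes the directional derivative in the direction of $X$:

\[ (D_X f)(H) = \frac{d}{dt} f(H + tX) \Big|_{t=0}. \tag{10.14} \]

Direct calculation then shows that

\begin{align*}
g(H) &= \sum_{j=1}^n c_j \Big( \prod_{k=2}^n \big( \langle \lambda_j, H_k \rangle - \langle \lambda_k, H_k \rangle \big) \Big) e^{\langle \lambda_j, H \rangle} \\
&= c_1 \Big( \prod_{k=2}^n \big( \langle \lambda_1, H_k \rangle - \langle \lambda_k, H_k \rangle \big) \Big) e^{\langle \lambda_1, H \rangle}. \tag{10.15}
\end{align*}

By evaluating at $H = 0$ and noting that, by construction, the product in the second
line of (10.15) is nonzero, we conclude that $c_1 = 0$. An entirely similar argument
then shows that each $c_j = 0$ as well. □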


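Corollary 10.17. Suppose $f \in C^\infty(\mathfrak{h})$ can be expressed as a finite linear
combination of exponentials $e^{\langle \lambda, H \rangle}$, with $\lambda$ integral,

\[ f(H) = \sum_\lambda c_\lambda e^{\langle \lambda, H \rangle}. \]

If $f$ satisfies $f(w \cdot H) = \det(w) f(H)$, then

\[ c_{w \cdot \lambda} = \det(w) c_\lambda \]

for each $\lambda$ occurring in the expansion of $f$.


Proof. On the one hand,

\[ f(w \cdot H) = \sum_\lambda c_\lambda e^{\langle \lambda, w \cdot H \rangle} = \sum_\eta c_{w \cdot \eta} e^{\langle \eta, H \rangle}. \tag{10.16} \]

On the other hand,

\[ f(w \cdot H) = \det(w) f(H) = \sum_\eta \det(w) c_\eta e^{\langle \eta, H \rangle}. \tag{10.17} \]

By the linear independence of the exponentials, the only way the expansions
in (10.16) and (10.17) can agree is if $c_{w \cdot \eta} = \det(w) c_\eta$ for all $\eta$. □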


10.5 The Weyl Dimension Formula 


Before coming to the proof of the Weyl character formula, we derive two important 
consequences of it, the Weyl dimension formula (described in this section) and the 
Kostant multiplicity formula (described in the next section). 

The dimension of a representation is equal to the value of the character at the
origin (Proposition 10.12). In the Weyl character formula, however, both the
numerator and the denominator are equal to zero when $H = 0$. In the
$\mathrm{sl}(2; \mathbb{C})$ case, for example, the character formula reads

\[ \chi_{\pi_m}(aH) = \frac{\sinh((m+1)a)}{\sinh a}. \]

The limit of this expression as $a$ tends to zero may be computed by l'Hospital's rule
to be $m + 1$, which is, of course, the dimension of $V_m$.

In the general case, we will expand both numerator and denominator of the 
character formula in a power series. We will see that in both numerator and 
denominator, the first nonzero term has degree k, where 


k = the number of positive roots. 


To evaluate the limit of this expression at the origin, we will develop a version of 
l’Hospital’s rule. The limit is then computed as the ratio of a certain k-fold derivative 
of the numerator and the corresponding k-fold derivative of the denominator, 
evaluated at the origin. The result of this analysis is expressed in the following 
theorem. 


Theorem 10.18. If $(\pi_\mu, V_\mu)$ is the irreducible representation of $\mathfrak{g}$ with highest
weight $\mu$, then the dimension of $V_\mu$ may be computed as

\[ \dim(V_\mu) = \frac{\prod_{\alpha \in R^+} \langle \alpha, \mu + \delta \rangle}{\prod_{\alpha \in R^+} \langle \alpha, \delta \rangle}. \]

Note that both $\mu + \delta$ and $\delta$ are strictly dominant elements, so that all the factors
in both the numerator and the denominator are nonzero.

A function $P$ on $\mathfrak{h}$ is called a polynomial if for every basis of $\mathfrak{h}$, the function
$P$ is a polynomial in the coordinates $z_1, \ldots, z_r$ associated to that basis. That is to
say, $P$ should be expressible as a finite linear combination of terms of the form
$z_1^{n_1} \cdots z_r^{n_r}$, where $n_1, \ldots, n_r$ are non-negative integers. It is easy to see that if $P$
is a polynomial in any one basis, then it is also a polynomial in every other basis as
well. A polynomial $P$ is said to be homogeneous of degree $l$ if $P(cH) = c^l P(H)$
for all constants $c$ and all $H \in \mathfrak{h}$.


Definition 10.19. Let $P : \mathfrak{h} \to \mathbb{C}$ be the function given by

\[ P(H) = \prod_{\alpha \in R^+} \langle \alpha, H \rangle. \]

Note that $P$ is a product of $k$ linear functions and is, thus, a homogeneous
polynomial of degree $k$. The dimension formula may be restated in terms of $P$ as

\[ \dim(V_\mu) = \frac{P(\mu + \delta)}{P(\delta)}. \]


A key property of the polynomial $P$ is its behavior under the action of the Weyl
group.

Definition 10.20. A function $f : \mathfrak{h} \to \mathbb{C}$ is said to be Weyl alternating if

\[ f(w \cdot H) = \det(w) f(H) \]

for all $w \in W$ and $H \in \mathfrak{h}$.


It is easy to see, for example, that the Weyl denominator q is Weyl alternating 
(Exercise 3). 


Proposition 10.21. 1. The function $P$ is Weyl alternating.
2. If $f : \mathfrak{h} \to \mathbb{C}$ is a Weyl-alternating polynomial, there is a polynomial $g : \mathfrak{h} \to \mathbb{C}$
such that

\[ f(H) = P(H) g(H). \]

In particular, if $f$ is homogeneous of degree $l < k$, then $f$ must be identically
zero, and if $f$ is homogeneous of degree $k$, then $f$ must be a constant multiple
of $P$.


Proof. For any $w \in W$, consider the collection of roots of the form $w^{-1} \cdot \alpha$ for
$\alpha \in R^+$. Since $R^+$ contains exactly one element out of each pair $\pm\alpha$ of roots, the
same is true of the collection of $w^{-1} \cdot \alpha$'s, with $w$ fixed and $\alpha$ varying over $R^+$. Thus,

\begin{align*}
P(w \cdot H) &= \prod_{\alpha \in R^+} \langle \alpha, w \cdot H \rangle \\
&= \prod_{\alpha \in R^+} \langle w^{-1} \cdot \alpha, H \rangle \\
&= (-1)^j \prod_{\alpha \in R^+} \langle \alpha, H \rangle,
\end{align*}

where $j$ is the number of negative roots in the collection $\{w^{-1} \cdot \alpha\}_{\alpha \in R^+}$.

Suppose first that $w = w^{-1} = s_\alpha$, where $\alpha$ is a positive simple root. According
to Proposition 8.30, $s_\alpha$ permutes the positive roots different from $\alpha$, whereas
$s_\alpha \cdot \alpha = -\alpha$. Thus, $j = 1$ in this case, and so

\[ P(w \cdot H) = -P(H) = \det(w) P(H), \]

since the determinant of a reflection is $-1$. By Proposition 8.24, every element $w$ of
$W$ is a product of reflections associated to positive simple roots, and so

\begin{align*}
P(w \cdot H) &= P(s_{\alpha_{j_1}} \cdots s_{\alpha_{j_N}} \cdot H) \\
&= (-1)^N P(H) \\
&= \det(s_{\alpha_{j_1}} \cdots s_{\alpha_{j_N}}) P(H),
\end{align*}

showing that $P$ is alternating.

Suppose now that $f$ is any Weyl-alternating polynomial. Then for any positive
root $\alpha$, if $\langle \alpha, H \rangle = 0$, we will have

\[ f(H) = f(s_\alpha \cdot H) = -f(H), \]

since $\det(s_\alpha) = -1$. Thus, $f$ must vanish on the hyperplane orthogonal to $\alpha$, which
we denote as $V_\alpha$. It is then not hard to show (Exercise 4) that $f$ is divisible in the
space of polynomials by the linear function $\langle \alpha, H \rangle$, that is, $f(H) = \langle \alpha, H \rangle f'(H)$
for some polynomial $f'$. Now, if $\beta$ is any positive root different from $\alpha$, the
polynomial $f'$ must vanish at least on the portion of $V_\beta$ not contained in $V_\alpha$. But
since $\beta$ is not a multiple of $\alpha$, $V_\beta$ is distinct from $V_\alpha$, so that $V_\beta \cap V_\alpha$ is a subspace
of dimension $r - 2$. Thus, $V_\beta - (V_\alpha \cap V_\beta)$ is dense in $V_\beta$. Since $f'$ is continuous, it
must actually vanish on all of $V_\beta$.
It follows that $f'$ is divisible in the space of polynomials by $\langle \beta, H \rangle$, so that

\[ f(H) = \langle \alpha, H \rangle \langle \beta, H \rangle f''(H) \]

for some polynomial $f''$. Proceeding on in the same way, we see that $f$ contains a
factor of $\langle \alpha, H \rangle$ for each positive root $\alpha$, meaning that

\begin{align*}
f(H) &= \Big( \prod_{\alpha \in R^+} \langle \alpha, H \rangle \Big) g(H) \\
&= P(H) g(H)
\end{align*}

for some polynomial $g$, as claimed. □
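The alternating property of $P$ can be checked concretely for $\mathrm{sl}(3; \mathbb{C})$. The sketch below assumes the standard realization, which is not spelled out in this section: $H$ is a traceless diagonal matrix with entries $(h_1, h_2, h_3)$, the positive roots pair with $H$ as $h_1 - h_2$, $h_2 - h_3$, and $h_1 - h_3$, the Weyl group is the symmetric group $S_3$ permuting the entries, and $\det(w)$ is the sign of the permutation. The names `P`, `sign`, and `act` are hypothetical.

```python
from itertools import permutations

def P(h):
    """Product of <alpha, H> over the positive roots of sl(3; C) (assumed realization)."""
    h1, h2, h3 = h
    return (h1 - h2) * (h2 - h3) * (h1 - h3)

def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def act(perm, h):
    """Permute the diagonal entries of H."""
    return tuple(h[perm[i]] for i in range(len(h)))

H = (5, -2, -3)  # arbitrary traceless sample point with integer entries
checks = [(P(act(w, H)), sign(w) * P(H)) for w in permutations(range(3))]
```

Every pair in `checks` has equal entries, so $P(w \cdot H) = \det(w) P(H)$ at this sample point. Scaling `H` by a constant $c$ multiplies `P(H)` by $c^3$, consistent with $P$ being homogeneous of degree $k = 3$.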


Recall the notion of directional derivative $D_X$, defined in (10.14).


Lemma 10.22. Let $A$ denote the differential operator

\[ A = \prod_{\alpha \in R^+} D_\alpha. \]

For any $\lambda \in \mathfrak{h}$, let $f_\lambda : \mathfrak{h} \to \mathbb{C}$ be the function given by

\[ f_\lambda(H) = \sum_{w \in W} \det(w) e^{\langle w \cdot \lambda, H \rangle}. \]

Then $f_\lambda$ is Weyl alternating and is given by a convergent power series of the form

\[ f_\lambda(H) = c_\lambda P(H) + \text{terms of degree at least } k + 1 \tag{10.18} \]

for some constant $c_\lambda$. Furthermore, $(AP)(0) \neq 0$ and the constant $c_\lambda$ may be
computed as

\[ c_\lambda = \frac{(A f_\lambda)(0)}{(AP)(0)}, \]

where

\[ (A f_\lambda)(0) = |W| P(\lambda). \]


Proof. The proof that $f_\lambda$ is Weyl alternating is elementary. Since $f_\lambda$ is a sum of
exponentials, it is a real-analytic function, meaning that it can be expanded in a
convergent power series in the coordinates $x_1, \ldots, x_r$ associated to any basis for
$\mathfrak{h}$. In the power-series expansion of $f_\lambda$, we collect together all the terms that are
homogeneous of degree $l$. Thus, $f_\lambda$ is expressible as the sum of homogeneous
polynomials $q_{\lambda,l}$ of degree $l$. Since $f_\lambda$ is Weyl-alternating, it is not hard to show
(Exercise 5) that each of the polynomials $q_{\lambda,l}$ is also Weyl-alternating. Thus, by
Proposition 10.21, all the polynomials $q_{\lambda,l}$ with $l < k$ must be zero, and the polynomial
$q_{\lambda,k}$ must be a constant multiple of $P(H)$. This establishes the claimed form of the
series for $f_\lambda$.

On the one hand, applying $A$ to a homogeneous term of degree $l > k$ gives a
homogeneous term of degree $l - k > 0$, which will evaluate to zero at the origin.
Thus,

\[ (A f_\lambda)(0) = c_\lambda (AP)(0). \]

On the other hand, by directly differentiating the exponentials in the definition of
$f_\lambda$, we get

\begin{align*}
(A f_\lambda)(0) &= \sum_{w \in W} \det(w) \prod_{\alpha \in R^+} \langle w \cdot \lambda, \alpha \rangle \\
&= \sum_{w \in W} \det(w) P(w \cdot \lambda) \\
&= |W| P(\lambda).
\end{align*}

This shows that

\[ c_\lambda (AP)(0) = |W| P(\lambda). \tag{10.19} \]

Now, if $\lambda$ is strictly dominant, each factor in the formula for $P(\lambda)$ is nonzero, so
that $P(\lambda) \neq 0$. Applying (10.19) in such a case shows that $(AP)(0) \neq 0$. We can
thus solve (10.19) for $c_\lambda$ to obtain the claimed formula. □

Proof of the Weyl Dimension Formula. As we have noted already, $P(\mu + \delta)$ and
$P(\delta)$ are nonzero, so that $c_{\mu+\delta}$ and $c_\delta$ are also nonzero. For any $H \in \mathfrak{h}$, we have

\begin{align*}
\chi_\pi(tH) &= \frac{f_{\mu+\delta}(tH)}{f_\delta(tH)} \\
&= \frac{c_{\mu+\delta} t^k P(H) + O(t^{k+1})}{c_\delta t^k P(H) + O(t^{k+1})} \\
&= \frac{c_{\mu+\delta} P(H) + O(t)}{c_\delta P(H) + O(t)} \tag{10.20}
\end{align*}

for any nonzero $t$ for which the denominator of the last expression is nonzero.

Now, we know from the definition of a character that $\chi_\pi$ is a continuous function.
To determine the value of $\chi_\pi$ at the origin, we choose $H$ to be in the open
fundamental Weyl chamber, so that $P(H) \neq 0$, in which case the denominator
in (10.20) is nonzero for all sufficiently small nonzero $t$. Thus,

\begin{align*}
\dim(V_\mu) &= \lim_{t \to 0} \chi_\pi(tH) \\
&= \frac{c_{\mu+\delta}}{c_\delta} \\
&= \frac{P(\mu + \delta)}{P(\delta)},
\end{align*}

where we have used the formula for $c_\lambda$ in Lemma 10.22. Recalling the definition of
$P$ gives the dimension formula as stated in Theorem 10.18. □


Example 10.23. If $\mu_1$ and $\mu_2$ denote the two fundamental weights for $\mathrm{sl}(3; \mathbb{C})$, then
the dimension of the representation with highest weight $m_1 \mu_1 + m_2 \mu_2$ is given by

\[ \tfrac{1}{2} (m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2). \]

See Exercises 7 and 8 for the analogous formulas for $B_2$ and $G_2$.


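Proof. Note that scaling the inner product on $\mathfrak{h}$ by a constant does not affect the
right-hand side of the dimension formula, since the inner product occurs an equal
number of times in the numerator and denominator. Let us then normalize the inner
product so that all roots $\alpha$ satisfy $\langle \alpha, \alpha \rangle = 2$. With this normalization, $H_\alpha = \alpha$ and
we have

\begin{align*}
m_1 &= \langle \mu, H_1 \rangle = \langle \alpha_1, \mu \rangle \\
m_2 &= \langle \mu, H_2 \rangle = \langle \alpha_2, \mu \rangle.
\end{align*}

Letting $\alpha_3 = \alpha_1 + \alpha_2$, we have $\delta = \tfrac{1}{2}(\alpha_1 + \alpha_2 + \alpha_3) = \alpha_3$. We then note that
$\langle \alpha_1, \delta \rangle = 1$, $\langle \alpha_2, \delta \rangle = 1$, and $\langle \alpha_3, \delta \rangle = 2$. Thus, the numerator in the dimension
formula is

\begin{align*}
&(\langle \alpha_1, \mu \rangle + \langle \alpha_1, \delta \rangle)(\langle \alpha_2, \mu \rangle + \langle \alpha_2, \delta \rangle)(\langle \alpha_3, \mu \rangle + \langle \alpha_3, \delta \rangle) \\
&\quad = (m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2)
\end{align*}

and the denominator is (1)(1)(2). □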
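The computation in this proof can be mirrored in a few lines of code. The sketch below evaluates the Weyl dimension formula for $\mathrm{sl}(3; \mathbb{C})$ using the normalization $\langle \alpha, \alpha \rangle = 2$ from the proof, so that $\langle \alpha_i, \mu_j \rangle = \delta_{ij}$ and $\delta = \mu_1 + \mu_2$. Positive roots are stored by their coefficients in the simple roots; the names `POSITIVE_ROOTS_A2`, `pairing`, and `weyl_dimension` are hypothetical.

```python
from fractions import Fraction

# Positive roots of A2 as coefficient pairs (a1, a2), meaning a1*alpha_1 + a2*alpha_2.
POSITIVE_ROOTS_A2 = [(1, 0), (0, 1), (1, 1)]

def pairing(root, weight):
    """<alpha, n1*mu_1 + n2*mu_2>, using <alpha_i, mu_j> = delta_ij."""
    return sum(a * n for a, n in zip(root, weight))

def weyl_dimension(m1, m2):
    """P(mu + delta) / P(delta) for the highest weight mu = m1*mu_1 + m2*mu_2."""
    mu_plus_delta = (m1 + 1, m2 + 1)  # delta = mu_1 + mu_2
    delta = (1, 1)
    numerator, denominator = 1, 1
    for root in POSITIVE_ROOTS_A2:
        numerator *= pairing(root, mu_plus_delta)
        denominator *= pairing(root, delta)
    return Fraction(numerator, denominator)

dim_standard = weyl_dimension(1, 0)
dim_adjoint = weyl_dimension(1, 1)
dim_12 = weyl_dimension(1, 2)
```

The values are 3 for the standard representation, 8 for the adjoint representation, and 15 for the representation with highest weight (1,2) discussed in connection with Figure 10.4.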


10.6 The Kostant Multiplicity Formula 


We will obtain Kostant's multiplicity formula from the Weyl character formula by
developing a method for dividing by the Weyl denominator $q$. We now illustrate this
method in the case of $\mathrm{sl}(2; \mathbb{C})$, where the character formula takes the form

\[ \chi_{\pi_m}(aH) = \frac{e^{(m+1)a} - e^{-(m+1)a}}{e^a - e^{-a}}. \tag{10.21} \]

One way to divide the numerator of (10.21) by the denominator is to use the
geometric series $1/(1 - x) = 1 + x + x^2 + \cdots$. Applying this formally with
$x = e^{-2a}$ gives the following result:

\begin{align*}
\frac{1}{e^a - e^{-a}} &= \frac{1}{e^a (1 - e^{-2a})} \\
&= e^{-a} (1 + e^{-2a} + e^{-4a} + \cdots). \tag{10.22}
\end{align*}

If the real part of $a$ is negative, the series will not converge in the ordinary
sense. Nevertheless, if we treat the right-hand side of (10.22) as simply a formal series,
then (10.22) holds in the sense that if we multiply the right-hand side by $e^a - e^{-a}$,
we get 1. Using (10.22), we have

\begin{align*}
\frac{e^{(m+1)a} - e^{-(m+1)a}}{e^a - e^{-a}}
&= e^{(m+1)a} e^{-a} (1 + e^{-2a} + e^{-4a} + \cdots) - e^{-(m+1)a} e^{-a} (1 + e^{-2a} + e^{-4a} + \cdots) \\
&= (e^{ma} + e^{(m-2)a} + \cdots) - (e^{-(m+2)a} + e^{-(m+4)a} + \cdots) \\
&= e^{ma} + e^{(m-2)a} + \cdots + e^{-ma}.
\end{align*}

This last expression is, indeed, the character of $V_m$. From this formula, we can read
off that each weight of $V_m$ has multiplicity 1 (Proposition 10.12).
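The formal division above can be imitated with truncated series. In the following sketch, a series in the exponentials $e^{na}$ is stored as a dictionary from the integer exponent $n$ to its coefficient; this representation, the truncation to `n_terms` terms, and the names `multiply`, `reciprocal_denominator`, and `character_by_division` are all assumptions made for illustration. Exponents below the reach of the truncated geometric series are discarded, since they are artifacts of the truncation.

```python
from collections import defaultdict

def multiply(f, g):
    """Product of two finite exponential series stored as {exponent: coefficient}."""
    out = defaultdict(int)
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[e1 + e2] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def reciprocal_denominator(n_terms):
    """Truncation of e^{-a}(1 + e^{-2a} + e^{-4a} + ...) from (10.22)."""
    return {-(2 * k + 1): 1 for k in range(n_terms)}

def character_by_division(m, n_terms=50):
    """Multiply the numerator of (10.21) by the truncated reciprocal of q.

    Assumes n_terms >= m + 1 so that all weights m, ..., -m are reached.
    """
    numerator = {m + 1: 1, -(m + 1): -1}
    product = multiply(numerator, reciprocal_denominator(n_terms))
    cutoff = m - 2 * (n_terms - 1)  # lowest exponent not affected by truncation
    return {e: c for e, c in product.items() if e >= cutoff}

weights_of_V3 = character_by_division(3)
q_times_reciprocal = multiply({1: 1, -1: -1}, reciprocal_denominator(10))
```

For $m = 3$ the surviving exponents are 3, 1, -1, and -3, each with coefficient 1. The product of $e^a - e^{-a}$ with the truncated reciprocal telescopes to $1 - e^{-20a}$, where the second term is the truncation remainder that disappears in the formal series.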


We now develop a similar method for dividing by the Weyl denominator in the
general case.

Definition 10.24. A formal exponential series is a formal series of the form

\[ \sum_\lambda c_\lambda e^{\langle \lambda, H \rangle}, \]

where each $\lambda$ is an integral element and where the $c_\lambda$'s are complex numbers.
A finite exponential series is a series of the same form where all but finitely many
of the $c_\lambda$'s are equal to zero.


Since we make no restrictions on the coefficients $c_\lambda$, a formal exponential series
may not converge. Thus, the series should properly be thought of not as a function
of $H$ but simply as a list of coefficients $c_\lambda$. If $f$ is a formal exponential series with
coefficients $c_\lambda$ and $g$ is a finite exponential series with coefficients $d_\lambda$, the product
of $f$ and $g$ is a well-defined formal exponential series with coefficients $e_\lambda$ given by

\[ e_\lambda = \sum_{\lambda'} c_{\lambda - \lambda'} d_{\lambda'}. \tag{10.23} \]

Note that only finitely many of the terms in (10.23) are nonzero, since $g$ is a finite
exponential series. (The product of two formal exponential series is, in general, not
defined, because the sum defining the coefficients in the product might be divergent.)


Definition 10.25. If $\lambda$ is an integral element, we let $p(\lambda)$ denote the number of
ways (possibly zero) that $\lambda$ can be expressed as a non-negative integer combination
of positive roots. The function $p$ is known as the Kostant partition function.


More explicitly, if the positive roots are $\alpha_1, \ldots, \alpha_k$, then $p(\lambda)$ is the number of
$k$-tuples of non-negative integers $(n_1, \ldots, n_k)$ such that $n_1 \alpha_1 + \cdots + n_k \alpha_k = \lambda$.


Example 10.26. If ot and a2 are the two positive simple roots for sl(3; C), then for 
any two non-negative integers mı and m2, we have 


P(m a, + m2) = 1 + min(m, m2). 


If A is not a non-negative integer combination of a; and a, then p(A) = 0. 
The result of Example 10.26 is shown graphically in Figure 10.6. 


Proof. We have the two positive simple roots œ; and a2, together with one other 
positive root œ = a, + a. Thus, if A can be expressed as a non-negative integer 
combination of œ; + a + @3, we can rewrite a3 as a, + a to express À as A = 
m1, + m2Q2, with mı, m2 > 0. Every expression for À is then of the form 


A = (mı —k)ay + (m2 — k)az + kas, 


for 0 < k < min(m, m2). o 
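The closed form in Example 10.26 can be confirmed by brute force. The sketch below counts triples $(n_1, n_2, n_3)$ directly from Definition 10.25, with a weight $m_1\alpha_1 + m_2\alpha_2$ represented by the pair `(m1, m2)`; the function name is an illustrative choice.

```python
from itertools import product


def kostant_partition_a2(m1, m2):
    """Count (n1, n2, n3) >= 0 with n1*a1 + n2*a2 + n3*(a1 + a2) = m1*a1 + m2*a2."""
    if m1 < 0 or m2 < 0:
        return 0
    candidates = product(range(m1 + 1), range(m2 + 1), range(min(m1, m2) + 1))
    return sum(1 for n1, n2, n3 in candidates if n1 + n3 == m1 and n2 + n3 == m2)


# Compare the direct count with the formula 1 + min(m1, m2).
for m1 in range(7):
    for m2 in range(7):
        assert kostant_partition_a2(m1, m2) == 1 + min(m1, m2)
```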


We are now ready to explain how to invert the Weyl denominator.

Fig. 10.6 The Kostant partition function $p$ for $A_2$. Unlabeled points $\lambda$ have $p(\lambda) = 0$


Proposition 10.27 (Formal Reciprocal of the Weyl Denominator). At the level
of formal exponential series, we have

\[
\frac{1}{q(H)} = \sum_{\eta \succeq 0} p(\eta)\, e^{-\langle \eta + \delta, H \rangle}. \tag{10.24}
\]

Here the sum is nominally over all integral elements $\eta$ with $\eta \succeq 0$, but $p(\eta) = 0$
unless $\eta$ is an integer combination of roots.


The proposition means, more precisely, that the product of $q(H)$ and the formal
exponential series on the right-hand side of (10.24) is equal to 1 (i.e., to $e^{0}$). To
prove Proposition 10.27, we first rewrite $q$ as a product.


Lemma 10.28. The Weyl denominator may be computed as

\[
q(H) = \prod_{\alpha \in R^+} \left( e^{\langle \alpha, H \rangle/2} - e^{-\langle \alpha, H \rangle/2} \right). \tag{10.25}
\]

See Exercise 9 for the explicit form of this identity in the case $\mathfrak{g}$ = sl(n + 1; C).


Proof. Let $\tilde{q}$ denote the product on the right-hand side of (10.25). If we expand out
the product in the definition of $\tilde{q}$, there will be a term equal to

\[
\prod_{\alpha \in R^+} e^{\langle \alpha, H \rangle/2} = e^{\langle \delta, H \rangle},
\]

where $\delta$ is half the sum of the positive roots. Note that even though $\alpha/2$ is not
necessarily integral, the element $\delta$ is integral (Proposition 8.38). We now claim that
every other exponential in the expansion will be of the form $\pm e^{\langle \lambda, H \rangle}$ with $\lambda$ integral
and strictly lower than $\delta$. To see this, note that every time we take $e^{-\langle \alpha, H \rangle/2}$ instead
of $e^{\langle \alpha, H \rangle/2}$ we lower the exponent by $\alpha$. Thus, any $\lambda$ appearing will be of the form

\[
\lambda = \delta - \sum_{\alpha \in E} \alpha,
\]

for some subset $E$ of $R^+$. Such a $\lambda$ is integral and lower than $\delta$.

Meanwhile, by precisely the same argument as in the proof of Point 1 of
Proposition 10.21, the function $\tilde{q}$ is alternating with respect to the action of $W$.
Thus, if we write

\[
\tilde{q}(H) = \sum_{\lambda} a_\lambda e^{\langle \lambda, H \rangle}, \tag{10.26}
\]

then by Corollary 10.17, the coefficients $a_\lambda$ must satisfy

\[
a_{w \cdot \lambda} = \det(w)\, a_\lambda. \tag{10.27}
\]

Since the exponential $e^{\langle \delta, H \rangle}$ occurs in the expansion (10.26) with a coefficient of
1, $e^{\langle w \cdot \delta, H \rangle}$ must occur with a coefficient of $\det(w^{-1}) = \det(w)$. Thus, it remains only
to show that no other exponentials can occur in (10.26). To see this, note that if any
exponential $e^{\langle \lambda, H \rangle}$ occurs for which $\lambda$ is not in the $W$-orbit of $\delta$, then by (10.27),
another exponential $e^{\langle \lambda', H \rangle}$ must appear with $\lambda'$ dominant but strictly lower than
$\delta$. Since $\delta$ is the minimal strictly dominant integral element, $\lambda'$ cannot be strictly
dominant (see Proposition 8.43) and thus must be orthogonal to one of the positive
simple roots $\alpha_j$. Thus, $s_{\alpha_j} \cdot \lambda' = \lambda'$, where $\det(s_{\alpha_j}) = -1$. Applying (10.27) to this
case shows that the coefficient of $e^{\langle \lambda', H \rangle}$ must be zero. □
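For $A_2$ the identity of Lemma 10.28 is small enough to expand by machine. The sketch below assumes weights are written in fundamental-weight coordinates, so that $\alpha_1 = (2,-1)$, $\alpha_2 = (-1,2)$, $\delta = (1,1)$, and the simple reflections act by $s_1(a,b) = (-a, a+b)$ and $s_2(a,b) = (a+b, -b)$. These coordinates and all function names are assumptions made for the illustration.

```python
from collections import Counter
from itertools import combinations

POSITIVE_ROOTS = [(2, -1), (-1, 2), (1, 1)]   # a1, a2, a1 + a2
DELTA = (1, 1)                                # half the sum of the positive roots


def reflect(i, weight):
    """Simple reflection s_i acting on a weight (a, b)."""
    a, b = weight
    return (-a, a + b) if i == 1 else (a + b, -b)


def alternating_sum():
    """Sum over W of det(w) e^{w.delta}, returned as {weight: coefficient}."""
    # The six elements of W, written as words in the simple reflections.
    words = [(), (1,), (2,), (1, 2), (2, 1), (1, 2, 1)]
    out = Counter()
    for word in words:
        weight = DELTA
        for i in reversed(word):
            weight = reflect(i, weight)
        out[weight] += (-1) ** len(word)   # det(w) = (-1)^(length of the word)
    return dict(out)


def expanded_product():
    """Expand prod(e^{a/2} - e^{-a/2}) as a sum over subsets E of the positive roots."""
    out = Counter()
    for size in range(len(POSITIVE_ROOTS) + 1):
        for subset in combinations(POSITIVE_ROOTS, size):
            weight = (DELTA[0] - sum(r[0] for r in subset),
                      DELTA[1] - sum(r[1] for r in subset))
            out[weight] += (-1) ** size
    return {w: c for w, c in out.items() if c != 0}


assert alternating_sum() == expanded_product()
```

In the expansion, the weight $(0,0)$ arises twice, from $E = \{\alpha_3\}$ and from $E = \{\alpha_1, \alpha_2\}$, with opposite signs. This is the kind of cancellation the proof predicts for weights outside the $W$-orbit of $\delta$.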


Figure 10.7 illustrates the proof of Lemma 10.28 for the root system $G_2$. The
white dots indicate weights of exponentials that do not, in fact, occur in the
expansion of $\tilde{q}$. Each of these white dots lies on the line orthogonal to some root,
which means that the corresponding exponential cannot occur in the expansion of a
Weyl-alternating function.


Proof of Proposition 10.27. As in the sl(2; C) example considered at the beginning
of this section, we have

\[
\frac{1}{e^{\langle \alpha, H \rangle/2} - e^{-\langle \alpha, H \rangle/2}}
= e^{-\langle \alpha, H \rangle/2} \left( 1 + e^{-\langle \alpha, H \rangle} + e^{-2\langle \alpha, H \rangle} + \cdots \right)
\]

at the level of formal exponential series. Taking a product over $\alpha \in R^+$ gives

\[
\frac{1}{q(H)} = e^{-\langle \delta, H \rangle} \prod_{\alpha \in R^+} \left( 1 + e^{-\langle \alpha, H \rangle} + e^{-2\langle \alpha, H \rangle} + \cdots \right).
\]

Fig. 10.7 The black dots indicate the $W$-orbit of $\delta$ for $G_2$. The white dots indicate weights of
exponentials that do not occur in the expansion of $q$


In the product, a term of the form $e^{-\langle \eta, H \rangle}$ will occur precisely as many times as there
are ways to express $\eta$ as a non-negative integer combination of the $\alpha$'s, namely,
$p(\eta)$ times. □
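Proposition 10.27 can be tested coefficient by coefficient for $A_2$. By Lemma 10.28, $q = e^{\langle \delta, H\rangle} \prod_{\alpha \in R^+} (1 - e^{-\langle \alpha, H\rangle})$, so the claim is equivalent to saying that $\prod_{\alpha}(1 - e^{-\alpha}) \cdot \sum_\eta p(\eta) e^{-\eta}$ equals 1. The sketch below uses simple-root coordinates, with $\eta = n_1\alpha_1 + n_2\alpha_2$ stored as `(n1, n2)`, and takes $p$ from Example 10.26; the names are illustrative.

```python
from itertools import combinations

ROOTS = [(1, 0), (0, 1), (1, 1)]   # a1, a2, a1 + a2 in simple-root coordinates


def p(n1, n2):
    """Kostant partition function for A2 (Example 10.26)."""
    return 1 + min(n1, n2) if n1 >= 0 and n2 >= 0 else 0


def coefficient_of_product(n1, n2):
    """Coefficient of e^{-(n1*a1 + n2*a2)} in prod(1 - e^{-a}) * sum_eta p(eta) e^{-eta}."""
    total = 0
    for size in range(len(ROOTS) + 1):
        for subset in combinations(ROOTS, size):
            k1 = n1 - sum(r[0] for r in subset)
            k2 = n2 - sum(r[1] for r in subset)
            total += (-1) ** size * p(k1, k2)
    return total


# The product should be the constant series 1.
for n1 in range(12):
    for n2 in range(12):
        expected = 1 if (n1, n2) == (0, 0) else 0
        assert coefficient_of_product(n1, n2) == expected
```

Each coefficient is a finite alternating sum, as guaranteed by (10.23), so no truncation of the infinite series is needed.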


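Theorem 10.29 (Kostant's Multiplicity Formula). Suppose $\mu$ is a dominant integral
element and $V_\mu$ is the finite-dimensional irreducible representation with highest
weight $\mu$. Then if $\lambda$ is a weight of $V_\mu$, the multiplicity of $\lambda$ is given by

\[
\operatorname{mult}(\lambda) = \sum_{w \in W} \operatorname{sign}(w)\, p\bigl( w \cdot (\mu + \delta) - (\lambda + \delta) \bigr).
\]

Here $\operatorname{sign}(w)$ denotes $\det(w) = \pm 1$.

Proof. By the Weyl character formula and Proposition 10.27, we have

\[
\chi_\pi(H) = \left( \sum_{\eta \succeq 0} p(\eta)\, e^{-\langle \eta + \delta, H \rangle} \right)
\left( \sum_{w \in W} \det(w)\, e^{\langle w \cdot (\mu + \delta), H \rangle} \right).
\]

For a fixed weight $\lambda$, the coefficient of $e^{\langle \lambda, H \rangle}$ in the character $\chi_\pi$ is just the
multiplicity of $\lambda$ in $V_\mu$ (Proposition 10.12). This coefficient is the sum of the
quantity

\[
p(\eta) \det(w) \tag{10.28}
\]

over all pairs $(\eta, w)$ for which

\[
-\eta - \delta + w \cdot (\mu + \delta) = \lambda,
\]

or

\[
\eta = w \cdot (\mu + \delta) - (\lambda + \delta). \tag{10.29}
\]

Substituting (10.29) into (10.28) and summing over $W$ gives Kostant's formula. □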
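Kostant's formula is easy to run for sl(3; C), where the Weyl group has only six elements and $p$ is known in closed form (Example 10.26). The sketch below assumes weights are written in fundamental-weight coordinates $(a, b)$, so that $\alpha_1 = (2,-1)$, $\alpha_2 = (-1,2)$, and $\delta = (1,1)$. The function names and the dimension check are illustrative additions, not part of the text.

```python
DELTA = (1, 1)


def s1(w):
    a, b = w
    return (-a, a + b)


def s2(w):
    a, b = w
    return (a + b, -b)


# The six Weyl group elements of A2, paired with their determinants.
WEYL_GROUP = [
    (lambda w: w, 1),
    (s1, -1),
    (s2, -1),
    (lambda w: s1(s2(w)), 1),
    (lambda w: s2(s1(w)), 1),
    (lambda w: s1(s2(s1(w))), -1),
]


def partition(weight):
    """Kostant partition function for A2 at a weight in fundamental-weight coordinates."""
    a, b = weight
    # Solve weight = n1*a1 + n2*a2, which gives 3*n1 = 2a + b and 3*n2 = a + 2b.
    if (2 * a + b) % 3 or (a + 2 * b) % 3:
        return 0
    n1, n2 = (2 * a + b) // 3, (a + 2 * b) // 3
    return 1 + min(n1, n2) if n1 >= 0 and n2 >= 0 else 0


def multiplicity(mu, lam):
    """Sum over w of det(w) * p(w.(mu + delta) - (lam + delta))."""
    total = 0
    for w, det in WEYL_GROUP:
        top = w((mu[0] + DELTA[0], mu[1] + DELTA[1]))
        eta = (top[0] - lam[0] - DELTA[0], top[1] - lam[1] - DELTA[1])
        total += det * partition(eta)
    return total


def dimension(mu):
    """Add up multiplicities over all candidates mu - n1*a1 - n2*a2."""
    m1, m2 = mu
    bound = m1 + m2   # the lowest weight is mu - (m1 + m2)(a1 + a2)
    return sum(
        multiplicity(mu, (m1 - 2 * n1 + n2, m2 + n1 - 2 * n2))
        for n1 in range(bound + 1)
        for n2 in range(bound + 1)
    )


assert multiplicity((1, 1), (0, 0)) == 2   # zero weight of the adjoint representation
assert dimension((1, 0)) == 3              # the standard representation
```

Candidates in the box that are not weights of the representation contribute zero, because the formula computes the coefficient of $e^{\langle\lambda,H\rangle}$ in the character whether or not $\lambda$ is a weight.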


If $\mu$ is well in the interior of the fundamental Weyl chamber and $\lambda$ is very close to
$\mu$, then for all nontrivial elements $w$ of $W$, $w \cdot (\mu + \delta)$ will fail to be higher than $\lambda + \delta$.
In those cases, there is only one nonzero term in the formula and the multiplicity of
$\lambda$ will be simply $p(\mu - \lambda)$. (By Exercise 7 in Chapter 9, $p(\mu - \lambda)$ is the multiplicity
of $\lambda$ in the Verma module $W_\mu$.) In general, it suffices to compute the multiplicities
of dominant weights $\lambda$. For any dominant $\lambda$, there will be many elements $w$ of $W$
for which $w \cdot (\mu + \delta)$ will fail to be higher than $\lambda + \delta$. Nevertheless, in high-rank
examples, the order of the Weyl group is very large and the number of nonzero terms
in Kostant's formula can be large, even if it is not as large as the order of $W$.

Figure 10.8 carries out the multiplicity calculation for sl(3; C) in the irreducible
representation with highest weight (2, 9). (These multiplicities were presented
without proof in Figure 6.5.) The calculation is done only for the weights in
the fundamental Weyl chamber; all other multiplicities are determined by Weyl
invariance. The term involving $w \in W$ makes a nonzero contribution to the
multiplicity of $\lambda$ only if $\lambda + \delta$ is lower than $w \cdot (\mu + \delta)$, or equivalently if $\lambda$ is
lower than $w \cdot (\mu + \delta) - \delta$. For most weights $\lambda$ in the fundamental chamber, only
the $w = 1$ term makes a nonzero contribution. In those cases, the multiplicity of $\lambda$
is simply $p(\mu - \lambda)$.

Now, by Example 10.26 and Figure 10.6, $p(\mu - \lambda)$ increases by 1 each time $\lambda$
moves from one “ring” of weights to the ring immediately inside. For the weights
indicated by white dots, however, there are two nonzero terms, the second being the
one in which $w$ is the reflection about the vertical root $\alpha_1$. Since the determinant of
the reflection is $-1$, the second term enters with a minus sign. On the medium-sized
triangle, the first term is 4 and the second term is $-1$, while on the small triangle,
the two terms are 5 and $-2$. Thus, in the end, all of the weights in all three of the
triangles end up with a multiplicity of 3.
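One instance of the two-term computation can be written out explicitly. This sketch assumes that the label (2, 9) means $\mu = 2\mu_1 + 9\mu_2$ in terms of the fundamental weights, and it uses the illustrative weight $\lambda = \mu - 3\alpha_1 - 5\alpha_2 = \mu_1 + 2\mu_2$, which is a choice made here rather than a point singled out in the text.

```latex
\begin{align*}
\mu + \delta &= 3\mu_1 + 10\mu_2, \qquad
s_{\alpha_1} \cdot (\mu + \delta) = (\mu + \delta) - 3\alpha_1, \\
w = 1 &: \quad p(\mu - \lambda) = p(3\alpha_1 + 5\alpha_2) = 1 + \min(3, 5) = 4, \\
w = s_{\alpha_1} &: \quad -p\bigl( (\mu + \delta) - 3\alpha_1 - (\lambda + \delta) \bigr)
  = -p(5\alpha_2) = -1, \\
\operatorname{mult}(\lambda) &= 4 - 1 = 3.
\end{align*}
```

For the remaining four elements of $W$, the element $w \cdot (\mu + \delta) - (\lambda + \delta)$ has a negative coefficient on $\alpha_1$ or $\alpha_2$, so those terms vanish.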

It is not hard to see that the pattern in Figure 10.8 holds in general for
representations of sl(3; C): As we move inward from one “ring” of weights to the
next, the multiplicities increase by 1 at each step, until the rings become triangles, at
which point the multiplicities become constant. (There is an increase in multiplicity
from the last hexagon to the first triangle, but not from the first triangle to any
of the subsequent triangles.) In particular, if the highest weight is on the edge of
the fundamental Weyl chamber, all of the rings will be triangles and, thus, all of
the weights have multiplicity 1. A small piece of this pattern of multiplicities was
determined in a more elementary way in Exercises 11 and 12 in Chapter 6.

Fig. 10.8 Multiplicity calculation for the representation with highest weight (2, 9)


For other low-rank Lie algebras, the pattern of multiplicities is more complicated,
but can, in principle, be computed explicitly. In particular, the Kostant partition
function can be computed explicitly for $B_2$ and $G_2$, at which point one only needs
to work out which terms in the multiplicity formula contribute, for weights in the
fundamental chamber. (See, for example, [Tar] or [Cap] and compare [CT], which
gives an elegant graphical method of computing the multiplicities in the $B_2 = C_2$
case.) For Lie algebras of rank higher than two, [BBCV] gives efficient algorithms
for computing the partition function for the classical Lie algebras, either numerically
or symbolically. If the order of the Weyl group is not too large, one can then
implement Kostant's formula to read off the multiplicities.
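For small examples the partition function can also be evaluated directly from Definition 10.25 by a simple recursion. This is not one of the efficient algorithms of the cited references; it is a naive sketch. It assumes weights and roots are given in simple-root coordinates, and for $B_2$ it assumes the positive roots are $\alpha_1, \alpha_2, \alpha_1 + \alpha_2, 2\alpha_1 + \alpha_2$ with $\alpha_1$ the short simple root. The function names are hypothetical.

```python
from functools import lru_cache


def make_partition_function(positive_roots):
    """Return p(lam), the number of ways to write lam as a non-negative integer
    combination of the given positive roots (all in simple-root coordinates)."""
    roots = tuple(tuple(r) for r in positive_roots)

    @lru_cache(maxsize=None)
    def count(lam, start):
        if any(c < 0 for c in lam):
            return 0
        if all(c == 0 for c in lam):
            return 1
        if start == len(roots):
            return 0
        # Either use roots[start] one more time, or move on to the next root.
        reduced = tuple(c - r for c, r in zip(lam, roots[start]))
        return count(reduced, start) + count(lam, start + 1)

    return lambda lam: count(tuple(lam), 0)


p_a2 = make_partition_function([(1, 0), (0, 1), (1, 1)])
p_b2 = make_partition_function([(1, 0), (0, 1), (1, 1), (2, 1)])

assert p_a2((3, 5)) == 4    # agrees with 1 + min(3, 5) from Example 10.26
assert p_b2((2, 1)) == 3    # 2a1 + a2;  a1 + (a1 + a2);  (2a1 + a2)
```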

For higher-rank cases, the order of the Weyl group can be very large, in which
case it is not feasible to use Kostant's formula, even if the partition function is
known. Freudenthal's formula (Sect. 22.3 in [Hum]) gives an alternative method
of computing the multiplicities, which is preferable in these cases, because it does
not involve a sum over the Weyl group.


10.7 The Character Formula for Verma Modules


Before coming to the proof of the Weyl character formula, we consider a “warm up
case,” that of the character of a Verma module. In the character formula for Verma
modules, there is an even larger cancellation than in the Weyl character formula. The
product of the Weyl denominator and the character would appear to be an infinite
sum of exponentials, and yet all but one of these exponentials cancel out! The proof
of this character formula follows a very natural course, consisting of writing down
explicitly the multiplicities for the various weight spaces and then using the formula
for the multiplicities to establish the desired cancellation. The character formula for
Verma modules is a key ingredient in the proof of the character formula for
finite-dimensional representations.

Since a Verma module $(\pi, W_\mu)$ is infinite dimensional, the operator $e^{\pi(H)}$ may
not have a well-defined trace. On the other hand, Point 2 of Proposition 10.12 gives
an expression for the character of a finite-dimensional representation, evaluated at
a point in $\mathfrak{h}$, in terms of the weights for the representation. This observation allows
us to define the character of any representation that decomposes as a direct sum of
weight spaces, as a formal exponential series on $\mathfrak{h}$. (Recall Definition 10.24.)


Definition 10.30. Let $V$ be any representation (possibly infinite dimensional) that
decomposes as a direct sum of integral weight spaces of finite multiplicity. We define
the formal character of $V$ by the formula

\[
Q_V(H) = \sum_{\lambda} \operatorname{mult}(\lambda)\, e^{\langle \lambda, H \rangle}, \qquad H \in \mathfrak{h},
\]

where $Q_V$ is interpreted as a formal exponential series.


Proposition 10.31 (Character Formula for Verma Modules). For any integral
element $\lambda$, the formal character of the Verma module $W_\lambda$ is given by

\[
Q_{W_\lambda}(H) = \sum_{\eta \succeq 0} p(\eta)\, e^{\langle \lambda - \eta, H \rangle}, \tag{10.30}
\]

where $p$ is the Kostant partition function in Definition 10.25. This formal character
may also be expressed as

\[
Q_{W_\lambda}(H) = e^{\langle \lambda + \delta, H \rangle} \left( \frac{1}{q(H)} \right), \tag{10.31}
\]

where $1/q(H)$ is the formal reciprocal of the Weyl denominator given in
Proposition 10.27.


The formula (10.31) is similar to the Weyl character formula, except that in
the numerator we have only a single exponential rather than an alternating sum
of exponentials. Of course, (10.31) implies that

\[
q(H)\, Q_{W_\lambda}(H) = e^{\langle \lambda + \delta, H \rangle}. \tag{10.32}
\]

This result means that when we multiply the Weyl denominator (an alternating finite
sum of exponentials) and the formal character $Q_{W_\lambda}$ (an infinite sum of exponentials),
the result is just a single exponential, with all the other terms cancelling out. As
usual, it is easy to see this cancellation explicitly in the case of sl(2; C). There, if
$m$ is any integer (possibly negative) and $H$ is the usual diagonal basis element, we
have

\[
Q_{W_m}(aH) = e^{ma} + e^{(m-2)a} + e^{(m-4)a} + \cdots.
\]

Since $q(aH) = e^{a} - e^{-a}$, we obtain

\begin{align*}
q(aH)\, Q_{W_m}(aH) &= e^{(m+1)a} + e^{(m-1)a} + e^{(m-3)a} + \cdots \\
&\qquad - e^{(m-1)a} - e^{(m-3)a} - \cdots \\
&= e^{(m+1)a}.
\end{align*}
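The telescoping in the sl(2; C) case can be observed on a truncation of the infinite series. In the sketch below, exponents $n$ stand for $e^{na}$ and only the first `terms` terms of $Q_{W_m}$ are kept, so one extra boundary term survives at the far end; the function names are illustrative.

```python
def verma_character(m, terms):
    """First `terms` terms of Q_{W_m}: e^{ma} + e^{(m-2)a} + ..., as {exponent: coefficient}."""
    return {m - 2 * k: 1 for k in range(terms)}


def times_denominator(series):
    """Multiply a finite series by q = e^{a} - e^{-a}."""
    out = {}
    for n, c in series.items():
        out[n + 1] = out.get(n + 1, 0) + c
        out[n - 1] = out.get(n - 1, 0) - c
    return {n: c for n, c in out.items() if c != 0}


m, terms = -3, 50
truncated = times_denominator(verma_character(m, terms))

# Everything cancels except e^{(m+1)a} and one term created by the truncation.
assert truncated == {m + 1: 1, m + 1 - 2 * terms: -1}
```

As `terms` grows, the boundary term moves off toward $-\infty$. At the level of formal series it is absent, leaving the single exponential $e^{(m+1)a}$.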


Proof. Enumerate the positive roots as $\alpha_1, \ldots, \alpha_k$ and let $\{Y_j\}$ be nonzero elements
of $\mathfrak{g}_{-\alpha_j}$, $j = 1, \ldots, k$. By Theorem 9.13, the elements of the form

\[
\pi(Y_1)^{n_1} \cdots \pi(Y_k)^{n_k} v_0 \tag{10.33}
\]

form a basis for the Verma module. An element of the form (10.33) is a weight
vector with weight

\[
\xi = \lambda - n_1\alpha_1 - \cdots - n_k\alpha_k.
\]

The number of times the weight $\xi$ will occur is the number of ways that $\lambda - \xi$ can
be written as a non-negative integer combination of the positive roots. Thus, we
obtain the first expression for $Q_{W_\lambda}$. The second expression then follows easily from
the first expression and the formula (Proposition 10.27) for the formal inverse of the
Weyl denominator. [Compare (10.30) to (10.24).] □


10.8 Proof of the Character Formula 


Our strategy in proving the character formula is as follows. We will show first
that the (formal) character of a Verma module $W_\mu$ can be expressed as a finite
linear combination of characters of irreducible highest weight cyclic representations
$V_\eta$. (The $V_\eta$'s may be infinite dimensional.) By using the action of the Casimir
element, we will see that the only $\eta$'s appearing in this decomposition are
those satisfying $|\eta + \delta|^2 = |\mu + \delta|^2$. We will then invert this relationship and
express the character of an irreducible representation $V_\mu$ as a linear combination of
characters of Verma modules $W_\eta$, with $\eta$ again satisfying $|\eta + \delta|^2 = |\mu + \delta|^2$. We
will then specialize to the case in which $\mu$ is dominant integral, where $V_\mu$ is finite
dimensional and its character $\chi^\mu$ is a finite sum of exponentials. By the character
formula for Verma modules from Sect. 10.7, we obtain the following conclusion:
The product $q(H)\chi^\mu(H)$ is a finite linear combination of exponentials $e^{\langle \lambda, H \rangle}$,
where each $\lambda = \eta + \delta$ satisfies $|\lambda|^2 = |\mu + \delta|^2$. From this point, it is a short
step to show that only $\lambda$'s of the form $\lambda = w \cdot (\mu + \delta)$ occur and then to prove the
character formula.

We know from general principles that the product $q(H)\chi^\mu(H)$ is a finite sum of
exponentials. We need to show that the only exponentials that actually occur in this
product (with a nonzero coefficient) are those of the form $e^{\langle w \cdot (\mu + \delta), H \rangle}$, and that such
exponentials occur with a coefficient of $\det(w)$. We begin with a simple observation
that limits which exponentials could possibly occur in the product.


Proposition 10.32. If an exponential $e^{\langle \lambda, H \rangle}$ occurs with nonzero coefficient in the
product $q(H)\chi^\mu(H)$, then $\lambda$ must be in the convex hull of the $W$-orbit of $\mu + \delta$ and
$\lambda$ must differ from $\mu + \delta$ by an element of the root lattice.


In the last image in Figure 10.4, the white and black circles indicate weights
consistent with Proposition 10.32. Only a small fraction of these weights (the black
circles) actually occur in $q(H)\chi^\mu(H)$.


Proof. If $\lambda$ is an integral element, then for each root $\alpha$,

\[
s_\alpha \cdot \lambda = \lambda - 2\frac{\langle \lambda, \alpha \rangle}{\langle \alpha, \alpha \rangle}\, \alpha
\]

will differ from $\lambda$ by an integer multiple of $\alpha$. Since roots are integral, we conclude
that $s_\alpha \cdot \lambda$ is, again, integral. It follows that for any $w \in W$, the element $w \cdot \lambda$ is integral
and differs from $\lambda$ by an integer combination of roots. Thus, by Proposition 8.38,
$w \cdot \delta$ is integral and differs from $\delta$ by an integer combination of roots. Similarly,
each weight of $\chi^\mu$ is integral and differs from $\mu$ by an integer combination of
roots (Theorem 10.1). Thus, for each exponential $e^{\langle \lambda, H \rangle}$ occurring in the product
$q(H)\chi^\mu(H)$, the element $\lambda$ will be integral and will differ from $\delta + \mu$ by an integer
combination of roots.

Meanwhile, since each exponential in $q$ is in the Weyl-orbit of $\delta$ and each
exponential in $\chi^\mu$ is in the convex hull of the Weyl-orbit of $\mu$ (Theorem 10.1), each
exponential in the product will be in the convex hull of the Weyl-orbit of $\mu + \delta$, as
claimed. □


Our next result is the key to the proof of the character formula. In fact, we will
see that it, in conjunction with Proposition 10.32, limits the exponentials that can
occur in $q(H)\chi^\mu(H)$ to only those whose weights are in the Weyl orbit of $\mu + \delta$.


Proposition 10.33. If an exponential $e^{\langle \lambda, H \rangle}$ occurs with nonzero coefficient in the
product $q(H)\chi^\mu(H)$, then $\lambda$ must satisfy

\[
\langle \lambda, \lambda \rangle = \langle \mu + \delta, \mu + \delta \rangle. \tag{10.34}
\]

Fig. 10.9 The weights indicated by white dots satisfy the condition in Proposition 10.33 but not
the condition in Proposition 10.32. The weights indicated by black dots satisfy both conditions


In the last image in Figure 10.4, for example, the weights $\lambda$ represented
by the white dots do not satisfy (10.34) and, thus, cannot occur in the product
$q(H)\chi^\mu(H)$. Much of the rest of this section will be occupied with the proof of
Proposition 10.33. Before turning to this task, however, let us see how
Propositions 10.32 and 10.33 together imply the character formula. The claim is that the
only weights satisfying the conditions in both of these propositions are the ones in
the $W$-orbit of $\mu + \delta$. See Figure 10.9.


Proof of Character Formula Assuming Proposition 10.33. Suppose $v_1, \ldots, v_m$ are
distinct elements of a real inner product space, all of which have the same norm, $S$.
The convex hull of these elements is the set of vectors of the form

\[
\sum_{j=1}^{m} a_j v_j
\]

with $0 \leq a_j \leq 1$ and $\sum_j a_j = 1$. It is then not hard to show by induction
on $m$ that the only way such a convex combination can have norm $S$ is if one
of the $a_j$'s is equal to 1 and all the others are zero. Applying this with the $v_j$'s
equal to $w \cdot (\mu + \delta)$, $w \in W$, shows that the only exponentials $e^{\langle \lambda, H \rangle}$ satisfying
both Propositions 10.32 and 10.33 are those of the form $\lambda = w \cdot (\mu + \delta)$.
Thus, the only exponentials appearing in the product $q\chi^\mu$ are those of the form
$e^{\langle w \cdot (\mu + \delta), H \rangle}$, $w \in W$.

Now, the exponential $e^{\langle \mu + \delta, H \rangle}$ occurs in the product exactly once, since it
corresponds to taking the highest weight from $q$ (i.e., $\delta$) and the highest weight
from $\chi^\mu$ (i.e., $\mu$), and the coefficient of $e^{\langle \mu + \delta, H \rangle}$ is 1. (Note that since $\delta$ is strictly
dominant, $w \cdot \delta$ is strictly lower than $\delta$ for all nontrivial $w \in W$, by Proposition 8.42.)
Since the character is Weyl invariant and the Weyl denominator is Weyl alternating,
their product is Weyl alternating. Thus, by Corollary 10.17, the coefficients in the
expansion of $\chi^\mu q$ are alternating; that is, the coefficient of $e^{\langle w \cdot (\mu + \delta), H \rangle}$ must equal
$\det(w)$. □
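The induction on $m$ mentioned in the proof rests on the two-vector case, which can be written out as follows. This is a sketch of the standard computation, with $a$ denoting the coefficient of $v_1$ (so the coefficient of $v_2$ is $1 - a$) and using $2\langle v_1, v_2 \rangle = 2S^2 - \|v_1 - v_2\|^2$.

```latex
\begin{align*}
\| a v_1 + (1 - a) v_2 \|^2
  &= a^2 S^2 + (1 - a)^2 S^2 + 2a(1 - a) \langle v_1, v_2 \rangle \\
  &= S^2 \bigl( a^2 + (1 - a)^2 + 2a(1 - a) \bigr) - a(1 - a)\, \| v_1 - v_2 \|^2 \\
  &= S^2 - a(1 - a)\, \| v_1 - v_2 \|^2 .
\end{align*}
```

Since $v_1 \neq v_2$, the right-hand side is strictly less than $S^2$ whenever $0 < a < 1$, so the norm $S$ is attained only at $a = 0$ or $a = 1$.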


We now begin the process of proving Proposition 10.33. A crucial role in the 
proof is played by the Casimir element, which we have already introduced in 
Sect. 10.2. (See Definition 10.4.) We now extend the result of Proposition 10.6 
to highest-weight cyclic representations that may not be irreducible or finite 
dimensional. 


Proposition 10.34. Let $(\pi, V)$ be a highest-weight cyclic representation of $\mathfrak{g}$
(possibly infinite dimensional) with highest weight $\lambda$ and let $\tilde{\pi}$ be the extension
of $\pi$ to $U(\mathfrak{g})$. Then $\tilde{\pi}(C) = c_\lambda I$, where

\[
c_\lambda = \langle \lambda + \delta, \lambda + \delta \rangle - \langle \delta, \delta \rangle.
\]

Proposition 10.34 applies, in particular, to the Verma module $W_\lambda$ and to the
irreducible representation $V_\lambda$. A key consequence of the proposition is that if
two highest weight cyclic representations, with highest weights $\lambda_1$ and $\lambda_2$, have
the same eigenvalue of the Casimir, then $\langle \lambda_1 + \delta, \lambda_1 + \delta \rangle$ must coincide with
$\langle \lambda_2 + \delta, \lambda_2 + \delta \rangle$.


Proof. The same argument as in the proof of Proposition 10.6 shows that if $v_0$ is the
highest weight vector for $V$, then $\tilde{\pi}(C)v_0 = c_\lambda v_0$. Now let $U$ be the space of all
$v \in V$ for which $\tilde{\pi}(C)v = c_\lambda v$. Since $C$ is in the center of $U(\mathfrak{g})$ (Proposition 10.5),
we see that if $v \in U$, then

\[
\tilde{\pi}(C)(\pi(X)v) = \pi(X)\tilde{\pi}(C)v = c_\lambda \pi(X)v,
\]

showing that $\pi(X)v$ is again in $U$. Thus, $U$ is an invariant subspace of $V$ containing
$v_0$, which means that $U = V$, that is, that $\tilde{\pi}(C)v = c_\lambda v$ for all $v \in V$. □


Definition 10.35. Let $\Lambda$ denote the set of integral elements. If $\mu$ is a dominant
integral element, define a set $S_\mu \subset \Lambda$ as follows:

\[
S_\mu = \{ \eta \in \Lambda \mid \langle \eta + \delta, \eta + \delta \rangle = \langle \mu + \delta, \mu + \delta \rangle \}.
\]

Note that $S_\mu$ is the intersection of the integral lattice $\Lambda$ with the sphere of radius
$|\mu + \delta|$ centered at $-\delta$; see Figure 10.10. Since there are only finitely many
elements of $\Lambda$ in any bounded region, the set $S_\mu$ is finite, for any fixed $\mu$. Our
strategy for the proof of Proposition 10.33 is as discussed at the beginning of this
section. We will decompose the formal character of each Verma module $W_\eta$ with
$\eta \in S_\mu$ as a finite sum of formal characters of irreducible representations $V_\gamma$,
with each $\gamma$ also belonging to $S_\mu$. This expansion turns out to be of an “upper
triangular with ones on the diagonal” form, allowing us to invert the expansion
to express the formal character of irreducible representations in terms of formal
characters of Verma modules. In particular, the character of the finite-dimensional
representation $V_\mu$ will be expressed as a linear combination of formal characters of
Verma modules $W_\eta$ with $\eta \in S_\mu$. When we multiply both sides of this formula
by the Weyl denominator and use the character formula for the Verma module
(Proposition 10.31), we obtain the claimed form for $q\chi^\mu$.

Fig. 10.10 The set $S_\mu$ (black dots) consists of integral elements $\lambda$ for which $\|\lambda + \delta\| = \|\mu + \delta\|$

Of course, a key point in the argument is to verify that whenever the character
of an irreducible representation $V_\gamma$ appears in the expansion of the character
of $W_\eta$, $\eta \in S_\mu$, the highest weight $\gamma$ is also in $S_\mu$. This claim holds because
any subrepresentation $V_\gamma$ occurring in the decomposition of $W_\eta$ must have the
same eigenvalue of the Casimir as $W_\eta$. In light of Proposition 10.34, this means
that $\langle \gamma + \delta, \gamma + \delta \rangle$ must equal $\langle \eta + \delta, \eta + \delta \rangle$, which is assumed to be equal to
$\langle \mu + \delta, \mu + \delta \rangle$.
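The set $S_\mu$ can be listed explicitly in a small case. The sketch below does this for $A_2$, assuming weights $a\mu_1 + b\mu_2$ are stored as pairs `(a, b)` and that the roots are normalized to have squared length 2, so that $\langle \lambda, \lambda \rangle = \tfrac{2}{3}(a^2 + ab + b^2)$. Both this normalization and the function names are assumptions of the example.

```python
def norm_form(weight):
    """3 * <w, w> for the A2 weight w = a*mu1 + b*mu2 (kept integral on purpose)."""
    a, b = weight
    return 2 * (a * a + a * b + b * b)


def sphere_set(mu, delta=(1, 1)):
    """All integral eta with <eta + delta, eta + delta> = <mu + delta, mu + delta>."""
    target = norm_form((mu[0] + delta[0], mu[1] + delta[1]))
    # a^2 + ab + b^2 >= (3/4) * max(a^2, b^2), so |a| and |b| are at most sqrt(2*target/3).
    bound = int((2 * target / 3) ** 0.5) + 1
    shifted = [
        (a, b)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if norm_form((a, b)) == target
    ]
    return sorted((a - delta[0], b - delta[1]) for a, b in shifted)


S = sphere_set((2, 4))
assert (2, 4) in S
assert len(S) == 18   # three Weyl orbits of six points each on the sphere
```

For $\mu = (2, 4)$ the sphere contains the six points $w \cdot (\mu + \delta) - \delta$ together with twelve further lattice points, which illustrates why the norm condition alone does not force a weight into the $W$-orbit.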


Proposition 10.36. For each $\eta$ in $S_\mu$, the formal character of the Verma module
$W_\eta$ can be expressed as a linear combination of formal characters of irreducible
representations $V_\gamma$ with $\gamma$ in $S_\mu$ and $\gamma \preceq \eta$:

\[
Q_{W_\eta} = \sum_{\substack{\gamma \in S_\mu \\ \gamma \preceq \eta}} a^{\eta}_{\gamma}\, Q_{V_\gamma}. \tag{10.35}
\]

Furthermore, the coefficient $a^{\eta}_{\eta}$ of $Q_{V_\eta}$ in this decomposition is equal to 1.


See Figure 10.11.

Fig. 10.11 For $\eta \in S_\mu$, the character of $W_\eta$ is a linear combination of characters of irreducible
representations $V_\gamma$ with highest weights $\gamma \preceq \eta$ in $S_\mu$


Lemma 10.37. Let $(\pi, V)$ be a representation of $\mathfrak{g}$, possibly infinite dimensional,
that decomposes as a direct sum of weight spaces of finite multiplicity, and let $U$ be a
nonzero invariant subspace of $V$. Then both $U$ and the quotient representation $V/U$
decompose as a direct sum of weight spaces. Furthermore, the multiplicity of any
weight in $V$ is the sum of its multiplicity in $U$ and its multiplicity in $V/U$.


Proof. Let $u$ be an element of $U$. By assumption, we can decompose $u$ as $u =
v_1 + \cdots + v_j$, where the $v_k$'s belong to weight spaces in $V$ corresponding to distinct
weights $\lambda_1, \ldots, \lambda_j$. We wish to show that each $v_k$ actually belongs to $U$. If $j = 1$,
there is nothing to prove. If $j > 1$, then $\lambda_j \neq \lambda_1$, which means that there is
some $H \in \mathfrak{h}$ for which $\langle \lambda_j, H \rangle \neq \langle \lambda_1, H \rangle$. Then apply to $u$ the operator
$\pi(H) - \langle \lambda_1, H \rangle I$:

\[
(\pi(H) - \langle \lambda_1, H \rangle I)u = \sum_{k=1}^{j} \bigl( \langle \lambda_k, H \rangle - \langle \lambda_1, H \rangle \bigr) v_k. \tag{10.36}
\]

Since the coefficient of $v_1$ is zero, the vector in (10.36) is the sum of fewer than $j$
weight vectors. Thus, by induction on $j$, we can assume that each term on the
right-hand side of (10.36) belongs to $U$. In particular, a nonzero multiple of $v_j$ belongs
to $U$, which means $v_j$ itself belongs to $U$. Now, if $v_j$ is in $U$, then $u - v_j = v_1 +
\cdots + v_{j-1}$ is also in $U$. Thus, using induction again, we see that each of $v_1, \ldots, v_{j-1}$
belongs to $U$.

We conclude that the sum of the weight spaces in $U$ is all of $U$. Since weight
vectors with distinct weights are linearly independent (Proposition A.17), the sum
must be direct. We turn, then, to the quotient space $V/U$. It is evident that the images
of the weight spaces in $V$ are weight spaces in $V/U$ with the same weight. Thus,
the sum of the weight spaces in $V/U$ is all of $V/U$ and, again, the sum must be
direct.

Finally, consider a fixed weight $\lambda$ occurring in $V$, and let $V_\lambda$ be the associated
weight space. Let $q_\lambda$ be the restriction to $V_\lambda$ of the quotient map $q : V \to V/U$. The
kernel of $q_\lambda$ consists precisely of the weight vectors with weight $\lambda$ in $U$. Thus, the
dimension of the image of $q_\lambda$, which is the weight space in $V/U$ with weight $\lambda$, is
equal to $\dim V_\lambda - \dim(V_\lambda \cap U)$. The claim about multiplicities in $V$, $U$, and $V/U$
follows. □


Proof of Proposition 10.36. We actually prove a stronger result, that the formal
character of any highest-weight cyclic representation $U_\eta$ with $\eta \in S_\mu$ can be
decomposed as in (10.35). As in the proof of Proposition 6.11, any such $U_\eta$ decomposes as a
direct sum of weight spaces with weights lower than $\eta$ and with the multiplicity of
the weight $\eta$ being 1. For any such $U_\eta$, let

\[
M = \sum_{\gamma \in S_\mu} \operatorname{mult}(\gamma).
\]

Our proof will be by induction on $M$.

We first argue that if $M = 1$, then $U_\eta$ must be irreducible. If not, $U_\eta$ would
have a nontrivial invariant subspace $X$, and this subspace would, by Lemma 10.37,
decompose as a direct sum of weight spaces, all of which are lower than $\eta$. Thus, $X$
would have to contain a weight vector $w$ that is annihilated by each raising operator
$\pi(X_\alpha)$, $\alpha \in R^+$. Thus, $X$ would contain a highest weight cyclic subspace $X'$ with
some highest weight $\gamma$. By Proposition 10.34, the Casimir would act as the scalar
$\langle \gamma + \delta, \gamma + \delta \rangle - \langle \delta, \delta \rangle$ in $X'$. On the other hand, since $X'$ is contained in $U_\eta$, the
Casimir has to act as $\langle \eta + \delta, \eta + \delta \rangle - \langle \delta, \delta \rangle$ in $X'$, which, since $\eta \in S_\mu$, is equal
to $\langle \mu + \delta, \mu + \delta \rangle - \langle \delta, \delta \rangle$. Thus, $\gamma$ must belong to $S_\mu$. Meanwhile, $\gamma$ cannot equal
$\eta$ or else $X'$ would be all of $U_\eta$. Thus, both $\eta$ and $\gamma$ would have to have positive
multiplicities and $M$ would have to be at least 2.

Thus, when $M = 1$, the representation $U_\eta$ is irreducible, in which case,
Proposition 10.36 holds trivially. Assume now that the proposition holds for highest
weight cyclic representations with $M \leq M_0$, and consider a representation $U_\eta$ with
$M = M_0 + 1$. If $U_\eta$ is irreducible, there is nothing to prove. If not, then as we argued
in the previous paragraph, $U_\eta$ must contain a nontrivial invariant subspace $X'$ that is
highest weight cyclic with some highest weight $\gamma$ that belongs to $S_\mu$ and is strictly
lower than $\eta$. We can then form the quotient vector space $U_\eta/X'$, which will still be
highest weight cyclic with highest weight $\eta$. By Lemma 10.37, the multiplicity of $\xi$
in $U_\eta$ is the sum of the multiplicities of $\xi$ in $X'$ and in $U_\eta/X'$. Thus,

\[
Q_{U_\eta} = Q_{X'} + Q_{U_\eta/X'}.
\]

Now, both $X'$ and $U_\eta/X'$ contain at least one weight in $S_\mu$ with nonzero
multiplicity, namely $\gamma$ for $X'$ and $\eta$ for $U_\eta/X'$. Thus, both of these spaces must have
$M \leq M_0$ and we may assume, by induction, that their formal characters decompose
as a sum of characters of irreducible representations with highest weights in $S_\mu$.
These highest weights will be lower than $\eta$, in fact lower than $\gamma$ in the case of $X'$.
Thus, the character of $V_\eta$ will not occur in the expansion of $Q_{X'}$. But since $U_\eta/X'$
still has highest weight $\eta$, we may assume by induction that the character of $V_\eta$
will occur exactly once in the expansion of $Q_{U_\eta/X'}$, and, thus, exactly once in the
expansion of $Q_{U_\eta}$. □


Proposition 10.38. If we enumerate the elements of $S_\mu$ in nondecreasing order as
$S_\mu = \{\eta_1, \ldots, \eta_l\}$, then the matrix $A_{jk} := a^{\eta_k}_{\eta_j}$ is upper triangular with ones on the
diagonal. Thus, $A$ is invertible, and the inverse of $A$ is also upper triangular with
ones on the diagonal. It follows that we can invert the decomposition in (10.35) to
a decomposition of the form

\[
Q_{V_\eta} = \sum_{\gamma \in S_\mu} b^{\eta}_{\gamma}\, Q_{W_\gamma}, \tag{10.37}
\]

where $b^{\eta}_{\eta} = 1$.


It is easy to check (using, say, the formula for the inverse of a matrix in terms of
cofactors) that the inverse of an upper triangular matrix with ones on the diagonal is
again upper triangular with ones on the diagonal.
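The claim about inverses can also be seen constructively, by a route different from the cofactor formula mentioned above. If $A = I + N$ with $N$ strictly upper triangular, then $N^n = 0$ and $A^{-1} = I - N + N^2 - \cdots$ is a finite sum of upper triangular matrices. The sketch below implements this for integer matrices stored as lists of lists; the function name is illustrative.

```python
def unitriangular_inverse(A):
    """Inverse of an upper triangular integer matrix with ones on the diagonal.

    Writes A = I + N with N strictly upper triangular, so that N**n = 0 and
    A^{-1} = I - N + N^2 - ... is a finite sum.
    """
    n = len(A)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    N = [[A[i][j] - identity[i][j] for j in range(n)] for i in range(n)]

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    inverse = [row[:] for row in identity]
    power = [row[:] for row in identity]
    sign = 1
    for _ in range(1, n):
        power = matmul(power, N)   # next power of N
        sign = -sign
        inverse = [[inverse[i][j] + sign * power[i][j] for j in range(n)]
                   for i in range(n)]
    return inverse


A = [[1, 2, 3],
     [0, 1, 4],
     [0, 0, 1]]
assert unitriangular_inverse(A) == [[1, -2, 5], [0, 1, -4], [0, 0, 1]]
```

Because only integer additions and multiplications are used, the inverse has integer entries whenever $A$ does.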


Proof. For any finite partially ordered set, it is possible (Exercise 6) to enumerate
the elements in nondecreasing order. In our case, this means that we can enumerate
the elements of $S_\mu$ as $\eta_1, \ldots, \eta_l$ in such a way that if $\eta_j \preceq \eta_k$ then $j \leq k$. If we
expand $Q_{W_{\eta_k}}$ in terms of $Q_{V_{\eta_j}}$ as in Proposition 10.36, the only nonzero coefficients
are those with $\eta_j \preceq \eta_k$, which means that $j \leq k$. Thus, the matrix is upper
triangular. Since, also, the coefficient of $Q_{V_{\eta_k}}$ in the expansion of $Q_{W_{\eta_k}}$ is 1, the
expansion has ones on the diagonal. □


Proof of Proposition 10.33. We apply (10.37) with $\eta = \mu$, so that $Q_{V_\eta}$ is the
character $\chi^\mu(H)$ of the finite-dimensional, irreducible representation with highest
weight $\mu$. We then multiply both sides of (10.37) by the Weyl denominator $q$. Using
the character formula for Verma modules [Proposition 10.31 and Eq. (10.32)], we
obtain

\[
q(H)\chi^\mu(H) = \sum_{\gamma \in S_\mu} b^{\mu}_{\gamma}\, e^{\langle \gamma + \delta, H \rangle}. \tag{10.38}
\]

Since each $\gamma$ belongs to $S_\mu$, each weight $\lambda := \gamma + \delta$ occurring on the right-hand
side of (10.38) satisfies

\[
\langle \lambda, \lambda \rangle = \langle \gamma + \delta, \gamma + \delta \rangle = \langle \mu + \delta, \mu + \delta \rangle.
\]

Thus, we have expressed $q(H)\chi^\mu(H)$ as a linear combination of exponentials with
weights $\lambda$ satisfying Proposition 10.33. Since any such decomposition is unique
by Proposition 10.16, it must be the one obtained by multiplying together the
exponentials in $q$ and $\chi^\mu$. □


10.9 Exercises 


1. Suppose g is a complex Lie algebra that is reductive (Definition 7.1) but not 
semisimple. Show that there exists a finite-dimensional representation of g that 
is not completely reducible. (Compare Theorem 10.9 in the semisimple case.) 

2. Suppose g is a complex semisimple Lie algebra with the property that every 
finite-dimensional representation of g is completely reducible. Show that g 
decomposes as a direct sum of simple algebras. 

Hint: Consider the adjoint representation of g. 

3. Using Proposition 10.16, show that the Weyl denominator is a Weyl-alternating 
function. 

4. Suppose that $f : \mathfrak{h} \to \mathbb{C}$ is a polynomial and that $f(H) = 0$ whenever
$\langle \alpha, H \rangle = 0$. Show that $f$ is divisible in the space of polynomial functions
by $\langle \alpha, H \rangle$.

Hint: Choose coordinates $z_1, \ldots, z_r$ on $\mathfrak{h}$ for which $\langle \alpha, H \rangle = z_1$.

5. Suppose $f$ is an analytic function on $\mathfrak{h}$, meaning that $f$ can be expressed in
coordinates in a globally convergent power series. Collect together all the terms
in this power series that are homogeneous of degree $k$, so that $f$ is the sum of
homogeneous polynomials $p_k$ of degree $k$. Show that if $f$ is alternating with
respect to the action of $W$, so is each of the polynomials $p_k$.

Hint: Show that the composition of a homogeneous polynomial with a linear
transformation is again a homogeneous polynomial.

6. Show that any finite partially ordered set $E$ can be enumerated in nondecreasing
order.

Hint: Every finite partially ordered set has a minimal element, that is, an element
$x \in E$ such that no $y \neq x$ in $E$ is smaller than $x$.

7. Let {a 1, a2} be a base for the B root system, with a, being the shorter root, as 
in the left-hand side of Figure 8.7. Let jz; and u2 be the associated fundamental 
weights (Definition 8.36), so that every dominant integral element is uniquely 
expressible as 


u = Mı tı + Mp2, 


with mı and mz being non-negative integers. Show that the dimension of the 
irreducible representation with highest weight ju is 


1 
gm + 1)(m + 1)(m, + m + 2)(m, + 2m? + 3). 


Hint: Imitate the calculations in Example 10.23. 


8. Let the notation be as in Exercise 7, but with $B_2$ replaced by $G_2$. (See 
Figures 8.6 and 8.11.) Show that the dimension of the irreducible representation 
with highest weight $\mu$ is 

$$\frac{1}{120}(m_1 + 1)(m_2 + 1)(m_1 + m_2 + 2)(m_1 + 2m_2 + 3)(m_1 + 3m_2 + 4)(2m_1 + 3m_2 + 5).$$

Show that the smallest nontrivial irreducible representation of $G_2$ has dimension 7. 


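9. According to Lemma 10.28, we have the identity 

$$\sum_{w \in W} \det(w)\, e^{\langle w \cdot \delta, H \rangle} = \prod_{\alpha \in R^+} \left( e^{\langle \alpha, H \rangle / 2} - e^{-\langle \alpha, H \rangle / 2} \right). \tag{10.39}$$

Work out the explicit form of this identity for the case of the Lie algebra 
sl(n + 1;C), using the Cartan subalgebra $\mathfrak{h}$ described in Sect. 7.7.1 and the 
system of positive roots described in Sect. 8.10.1. If a typical element of $\mathfrak{h}$ 
has the form $(a_0, \ldots, a_n)$, introduce the variables $z_j = e^{a_j}$. Show that after 
multiplying both sides by $(z_0 \cdots z_n)^{n/2}$, the identity (10.39) takes the form of a 
Vandermonde determinant: 

$$\det \begin{pmatrix} z_0^n & z_0^{n-1} & \cdots & 1 \\ z_1^n & z_1^{n-1} & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ z_n^n & z_n^{n-1} & \cdots & 1 \end{pmatrix} = \prod_{j<k} (z_j - z_k).$$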

10. Let $\pi$ be an irreducible, finite-dimensional representation of g with highest 
weight $\mu$, and let $\pi^*$ be the dual representation to $\pi$. 

(a) Show that the weights of $\pi^*$ are the negatives of the weights of $\pi$. 

(b) Let $w_0$ be the unique element of W that maps the fundamental Weyl 
chamber C to −C. Show that the highest weight $\mu^*$ of $\pi^*$ may be 
computed as 

$$\mu^* = -w_0 \cdot \mu.$$

(Compare Exercises 2 and 3 in Chapter 6.) 

(c) Show that if −I is an element of the Weyl group, then every representation 
of g is isomorphic to its dual. 
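
As a numerical sanity check on the dimension formulas stated in Exercises 7 and 8, the following Python sketch evaluates them for small highest weights. It is not a solution to the exercises: it simply tabulates the closed-form expressions. The function names (`dim_b2`, `dim_g2`, `smallest_nontrivial`) and the search bound are illustrative assumptions, not notation from the text. 

```python
from fractions import Fraction


def dim_b2(m1: int, m2: int) -> int:
    """Dimension formula of Exercise 7 (B2, with alpha_1 the shorter simple root)."""
    value = Fraction((m1 + 1) * (m2 + 1) * (m1 + m2 + 2) * (m1 + 2 * m2 + 3), 6)
    assert value.denominator == 1  # a dimension must be an integer
    return int(value)


def dim_g2(m1: int, m2: int) -> int:
    """Dimension formula of Exercise 8 (G2)."""
    numerator = ((m1 + 1) * (m2 + 1) * (m1 + m2 + 2) * (m1 + 2 * m2 + 3)
                 * (m1 + 3 * m2 + 4) * (2 * m1 + 3 * m2 + 5))
    value = Fraction(numerator, 120)
    assert value.denominator == 1
    return int(value)


def smallest_nontrivial(dim, bound: int = 4) -> int:
    """Smallest dimension over nonzero (m1, m2) with 0 <= m1, m2 < bound.

    Both formulas increase in m1 and in m2, so a small (hypothetical) bound
    already captures the minimum.
    """
    return min(dim(m1, m2)
               for m1 in range(bound)
               for m2 in range(bound)
               if (m1, m2) != (0, 0))


if __name__ == "__main__":
    print("B2 fundamental dimensions:", dim_b2(1, 0), dim_b2(0, 1))
    print("G2 fundamental dimensions:", dim_g2(1, 0), dim_g2(0, 1))
    print("smallest nontrivial G2 dimension:", smallest_nontrivial(dim_g2))
```

Both formulas return 1 at $m_1 = m_2 = 0$, as they must for the trivial representation. 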


Part III 
Compact Lie Groups 


Chapter 11 
Compact Lie Groups and Maximal Tori 


In this chapter and Chapter 12 we develop the representation theory of a connected, 
compact matrix Lie group K. The main result is a “theorem of the highest weight,” 
which is very similar to our main results for semisimple Lie algebras. If we let $\mathfrak{k}$ be 
the Lie algebra of K and we let $\mathfrak{g}$ be the complexification of $\mathfrak{k}$, then $\mathfrak{g}$ is reductive, 
which means (Proposition 7.6) that g is the direct sum of a semisimple algebra 
and a commutative algebra. We can, therefore, draw on our structure results for 
semisimple Lie algebras to introduce the notions of roots, weights, and the Weyl 
group. We will, however, give a completely different proof of the theorem of the 
highest weight. In particular, our proof of the hard part of the theorem, the existence 
of an irreducible representation for each weight of the appropriate sort, will be based 
on decomposing the space of functions on K under the left and right action of 
K. This argument is independent of the Lie-algebraic construction using Verma 
modules. 

In the present chapter, we develop the structures needed to formulate a theorem 
of the highest weight for K, and we develop some key tools that will aid us in the 
proof of the theorem. The representations themselves will appear in the next chapter. 
The key results of this chapter are the torus theorem and the Weyl integral formula. 
Although parts of the chapter assume familiarity with the theory of manifolds and 
differential forms, the reader who is not familiar with that theory can still follow the 
statements of the key results. Furthermore, the torus theorem can easily be proved 
by hand in the case of SU(n). The reader who is willing to take the results of this 
chapter on faith can proceed on to Chapter 12, where they are applied to prove 
the compact-group versions of the Weyl character formula and the theorem of the 
highest weight. Finally, in Chapter 13, we will take a close look at the fundamental 
group of K. We will prove, among other things, that when K is simply connected, 
the notion of “dominant integral element” for K coincides with the analogous notion 
for the Lie algebra $\mathfrak{g}$. 

Throughout the chapter, we assume that K is a connected, compact matrix Lie 
group with Lie algebra $\mathfrak{k}$. We allow $\mathfrak{k}$, and thus also $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$, to have a nontrivial 
center, which means that $\mathfrak{g}$ is reductive but not necessarily semisimple. As in 
Proposition 7.4, we fix on $\mathfrak{g}$ an inner product that is real on $\mathfrak{k}$ and invariant under the 
adjoint action of K. 


11.1 Tori 


In this section, we consider tori, that is, groups isomorphic to a product of copies of 
$S^1$. In the rest of the chapter, we will be interested in tori that arise as subgroups of 
connected compact groups. 


Definition 11.1. A matrix Lie group T is a torus if T is isomorphic to the direct 
product of k copies of the group $S^1 \cong U(1)$, for some k. 


Consider, for example, the group T of diagonal, unitary n × n matrices with determinant 
1. Every element of T can be expressed uniquely as 

$$\operatorname{diag}\left(u_1, \ldots, u_{n-1}, (u_1 \cdots u_{n-1})^{-1}\right)$$

for some complex numbers $u_1, \ldots, u_{n-1}$ with absolute value 1. Thus, T is isomorphic 
to the product of n − 1 copies of $S^1$. 


Theorem 11.2. Every compact, connected, commutative matrix Lie group is a 
torus. 


Recall that a subset E of a topological space is discrete if every element e of E 
has a neighborhood containing no points of E other than e. 


Lemma 11.3. Let V be a finite-dimensional inner product space over R, viewed as 
a group under vector addition, and let $\Gamma$ be a discrete subgroup of V. Then there 
exist linearly independent vectors $v_1, \ldots, v_k$ in V such that $\Gamma$ is precisely the set of 
vectors of the form 

$$m_1 v_1 + \cdots + m_k v_k,$$

with each $m_j \in \mathbb{Z}$. 


Proof. Since $\Gamma$ is discrete, there is some $\varepsilon > 0$ such that the only point $\gamma$ in $\Gamma$ with 
$\|\gamma\| < \varepsilon$ is $\gamma = 0$. For any $\gamma, \gamma' \in \Gamma$, if $\|\gamma' - \gamma\| < \varepsilon$, then since $\gamma' - \gamma$ is also in 
$\Gamma$, we must have $\gamma' = \gamma$. It then follows easily that there can be only finitely many 
points in $\Gamma$ in any bounded region of V. If $\Gamma = \{0\}$, the result holds with k = 0. 
Otherwise, we can find some nonzero $\gamma_0 \in \Gamma$ such that $\|\gamma_0\|$ is minimal among 
nonzero elements of $\Gamma$. Let W denote the orthogonal complement of the span of $\gamma_0$, 
let P denote the orthogonal projection of V onto W, and let $\Gamma'$ denote the image of 
$\Gamma$ under P. Since P is linear, $\Gamma'$ will be a subgroup of W. We now claim that $\Gamma'$ is 
discrete in W. 

Fig. 11.1 If $\gamma$ is very close to the line through $\gamma_0$, then $\|\gamma - n\gamma_0\|$ is smaller than $\|\gamma_0\|$ for some n 


Suppose, toward a contradiction, that $\Gamma'$ is not discrete. Then for every $\varepsilon > 0$, 
there must exist $\delta \neq \delta' \in \Gamma'$ with $\|\delta' - \delta\| < \varepsilon$. Thus, $\delta' - \delta$ is a nonzero element 
of $\Gamma'$ with norm less than $\varepsilon$. There then exists some $\gamma \in \Gamma$ with $P(\gamma) = \delta' - \delta$. 
Since $\|\delta' - \delta\| < \varepsilon$, the distance from $\gamma$ to the span of $\gamma_0$ is less than $\varepsilon$. Now let $\beta$ 
be the orthogonal projection of $\gamma$ onto the span of $\gamma_0$, which is the point closest to 
$\gamma$ in the line through $\gamma_0$. Then $\beta$ lies between $m\gamma_0$ and $(m + 1)\gamma_0$ for some integer m. 
By taking n = m or n = m + 1, we can assume that the distance between $n\gamma_0$ and 
$\beta$ is at most half the length of $\gamma_0$. Meanwhile, the distance from $\beta$ to $\gamma$ is at most $\varepsilon$. 
Thus, the distance from $n\gamma_0$ to $\gamma$ is at most $\varepsilon + \|\gamma_0\|/2$, which is less than $\|\gamma_0\|$ if 
$\varepsilon$ is small enough. But then $\gamma - n\gamma_0$ is a nonzero element of $\Gamma$ with norm less than 
the norm of $\gamma_0$, contradicting the minimality of $\gamma_0$. (See Figure 11.1.) 

Now that $\Gamma'$ is known to be discrete, we can apply induction on the dimension 
of V. Thus, there exist linearly independent vectors $u_1, \ldots, u_{k-1}$ in $\Gamma'$ such that 
$\Gamma'$ is precisely the set of integer linear combinations of $u_1, \ldots, u_{k-1}$. Let us then 
choose vectors $v_1, \ldots, v_{k-1}$ in $\Gamma$ such that $P(v_j) = u_j$. Since $P(v_1), \ldots, P(v_{k-1})$ 
are linearly independent in W, it is easy to see that $v_1, \ldots, v_{k-1}$ and $\gamma_0$ are linearly 
independent in V. For any $\gamma \in \Gamma$, the element $P(\gamma)$ is of the form 
$m_1 u_1 + \cdots + m_{k-1} u_{k-1}$. Thus, $\gamma$ must be equal to $m_1 v_1 + \cdots + m_{k-1} v_{k-1} + \sigma$, where $\sigma \in \Gamma$ 
satisfies $P(\sigma) = 0$, meaning that $\sigma$ is a multiple of $\gamma_0$. But then $\sigma$ must be an integer 
multiple of $\gamma_0$, or else, by an argument similar to the one in the previous paragraph, 
there would be a nonzero element of $\Gamma$ in the span of $\gamma_0$ with norm less than $\|\gamma_0\|$. We 
conclude, then, that 

$$\gamma = m_1 v_1 + \cdots + m_{k-1} v_{k-1} + m_k \gamma_0,$$

establishing the desired form of $\gamma$. □ 


Proof of Theorem 11.2. Let T be compact, connected, and commutative, and let $\mathfrak{t}$ be 
the Lie algebra of T. Then $\mathfrak{t}$ is also commutative (Proposition 3.22), in which 
case Corollary 3.47 tells us that the exponential map $\exp : \mathfrak{t} \to T$ is surjective. 
Furthermore, the exponential map for T is a Lie group homomorphism and its 
kernel $\Gamma$ must be discrete, since the exponential map is injective in a neighborhood 
of the origin. Thus, by Lemma 11.3, $\Gamma$ is the set of integer linear combinations 
of independent vectors $v_1, \ldots, v_k$. If $\dim \mathfrak{t} = n$, then exp descends to a bijective 
homomorphism of $\mathfrak{t}/\Gamma \cong (S^1)^k \times \mathbb{R}^{n-k}$ with T. Now, the Lie algebra map associated 
to this homomorphism is invertible (since the Lie algebra of $\mathfrak{t}/\Gamma$ is $\mathfrak{t}$), which means that 
the inverse map is also continuous. Thus, T is homeomorphic to $(S^1)^k \times \mathbb{R}^{n-k}$. Since 
T is compact, this can only happen if k = n, in which case, T is the torus $(S^1)^n$. □ 


At various points in the developments in later chapters, it will be useful to 
consider elements t in a torus T for which the subgroup generated by t is dense 
in T. The following result guarantees the existence of such elements. 


Proposition 11.4. Let $T = (S^1)^k$ and let $t = (e^{2\pi i \theta_1}, \ldots, e^{2\pi i \theta_k})$ be an element of 
T. Then t generates a dense subgroup of T if and only if the numbers 

$$1, \theta_1, \ldots, \theta_k$$

are linearly independent over the field $\mathbb{Q}$ of rational numbers. 


The k = 1 case of this result is Exercise 9 in Chapter 1. In particular, if x is 
any transcendental real number and we define $\theta_j = x^j$, $j = 1, \ldots, k$, then t will 
generate a dense subgroup of T. See Figure 11.2 for an example of an element that 
generates a dense subgroup of $S^1 \times S^1$. 
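
The dichotomy in Proposition 11.4 can be observed numerically. The sketch below is an illustration and not a proof: it records which cells of a coarse grid on $S^1 \times S^1$ (identified with $[0,1)^2$) are visited by the powers of t. The choices $(\sqrt{2}, \sqrt{3})$ and $(\sqrt{2}, 2\sqrt{2})$, the grid size, and the number of steps are assumptions made for the example. 

```python
import math


def orbit_cells(thetas, steps, grid):
    """Grid cells of [0,1)^k visited by (n*theta_1, ..., n*theta_k) mod 1, n = 0..steps-1.

    The point t^n = (e^{2 pi i n theta_1}, ..., e^{2 pi i n theta_k}) is recorded
    through its angle coordinates n*theta_j mod 1.
    """
    cells = set()
    for n in range(steps):
        cell = tuple(min(int(((n * th) % 1.0) * grid), grid - 1) for th in thetas)
        cells.add(cell)
    return cells


GRID = 5        # 5 x 5 grid of cells on the 2-torus (illustrative choice)
STEPS = 20000   # number of powers of t to examine (illustrative choice)

# 1, sqrt(2), sqrt(3) are linearly independent over Q: the orbit should fill the torus.
independent = orbit_cells((math.sqrt(2), math.sqrt(3)), STEPS, GRID)

# 1, sqrt(2), 2*sqrt(2) are dependent (2*theta_1 - theta_2 = 0): the orbit stays
# on the closed curve u_2 = u_1^2 and misses most cells.
dependent = orbit_cells((math.sqrt(2), 2 * math.sqrt(2)), STEPS, GRID)

if __name__ == "__main__":
    print("independent case: cells visited =", len(independent), "of", GRID ** 2)
    print("dependent case:   cells visited =", len(dependent), "of", GRID ** 2)
```

In the dependent case the relation $2\theta_1 - \theta_2 = 0$ gives the nonconstant homomorphism $(u_1, u_2) \mapsto u_1^2 u_2^{-1}$, whose kernel contains t, so the orbit cannot be dense. 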


Lemma 11.5. If T is a torus and t is an element of T, then the subgroup generated 
by t is not dense in T if and only if there exists a nonconstant homomorphism 
$\Phi : T \to S^1$ such that $\Phi(t) = 1$. 


Proof. Suppose first that there exists a nonconstant homomorphism $\Phi : T \to S^1$ 
with $\Phi(t) = 1$. Then $\ker(\Phi)$ is a closed subgroup of T that contains t and thus the 
group generated by t. But since $\Phi$ is nonconstant, $\ker(\Phi) \neq T$, which means that 
the closure of the group generated by t is not all of T. 

In the other direction, let S be the closure of the group generated by t and suppose 
that S is not all of T. We will proceed by describing the preimage of S under the 
exponential map, using an extension of Lemma 11.3. Thus, let $\Lambda$ be the set of all 
$H \in \mathfrak{t}$ such that $e^{2\pi H} \in S$. Since S is a closed subgroup of T, the set $\Lambda$ will be a 
closed subgroup of the additive group $\mathfrak{t}$. Now let $\Lambda_0$ be the identity component of 
$\Lambda$, which must be a subspace of $\mathfrak{t}$. (Indeed, by Corollaries 3.47 and 3.52, $\Lambda_0$ must 
be equal to the Lie algebra of $\Lambda$.) Since S is not all of T, the subspace $\Lambda_0$ cannot be 
all of $\mathfrak{t}$. 

Fig. 11.2 A portion of the dense subgroup generated by t in $S^1 \times S^1$ 

The entire group $\Lambda$ now decomposes as $\Lambda_0 \times \Lambda_1$, where 

$$\Lambda_1 := \Lambda \cap (\Lambda_0)^\perp$$

is a closed subgroup of $(\Lambda_0)^\perp$. Furthermore, the identity component of $\Lambda_1$ must be 
trivial, which means that the Lie algebra of $\Lambda_1$ must be {0}. Thus, by Theorem 3.42, 
$\Lambda_1$ is discrete. Let us now define a homomorphism $\phi : \mathfrak{t} \to S^1$ by setting 

$$\phi(H) = e^{2\pi i \xi(H)}$$

for some linear functional $\xi$ on $\mathfrak{t}$. By Lemma 11.3, $\Lambda_1$ is the integer span of linearly 
independent vectors $v_1, \ldots, v_l$. Since $\Lambda_0 \neq \mathfrak{t}$, we can arrange things so that $\xi$ is 
zero on $\Lambda_0$, the values of $\xi$ on $v_1, \ldots, v_l$ are integers, but $\xi$ is not identically zero. 
Then $\ker(\phi)$ will contain $\Lambda$ and, in particular, the kernel of the map $H \mapsto e^{2\pi H}$. 
Thus, there is a nonconstant, continuous homomorphism $\Phi : T \to S^1$ satisfying 

$$\Phi(e^{2\pi H}) = \phi(H)$$

for all $H \in \mathfrak{t}$. If we choose H so that $e^{2\pi H} = t \in S$, then $H \in \Lambda$, which means 
that 

$$\Phi(t) = \Phi(e^{2\pi H}) = \phi(H) = 1,$$

but $\Phi$ is not constant. □ 
Proof of Proposition 11.4. In light of Lemma 11.5, we may reformulate the proposition 
as follows: The numbers $1, \theta_1, \ldots, \theta_k$ are linearly dependent over $\mathbb{Q}$ if 
and only if there exists a nonconstant homomorphism $\Phi : T \to S^1$ with 
$(e^{2\pi i \theta_1}, \ldots, e^{2\pi i \theta_k}) \in \ker(\Phi)$. Suppose first that there is a dependence relation 
among $1, \theta_1, \ldots, \theta_k$ over $\mathbb{Q}$. Then after clearing the denominators from this relation, 
we find that there exist integers $m_1, \ldots, m_k$, not all zero, such that 

$$m_1 \theta_1 + \cdots + m_k \theta_k \in \mathbb{Z}.$$

Thus, we may define a nonconstant $\Phi : T \to S^1$ by 

$$\Phi(u_1, \ldots, u_k) = u_1^{m_1} \cdots u_k^{m_k} \tag{11.1}$$

and the kernel of $\Phi$ will contain t. 

In the other direction, Exercise 2 tells us that every continuous homomorphism 
$\Phi : T \to S^1$ is of the form (11.1) for some set of integers $m_1, \ldots, m_k$. Furthermore, 
if $\Phi$ is nonconstant, these integers cannot all be zero. Thus, if 

$$1 = \Phi(e^{2\pi i \theta_1}, \ldots, e^{2\pi i \theta_k}) = e^{2\pi i (m_1 \theta_1 + \cdots + m_k \theta_k)},$$

we must have $m_1 \theta_1 + \cdots + m_k \theta_k = n$ for some integer n, which implies that 
$1, \theta_1, \ldots, \theta_k$ are linearly dependent over $\mathbb{Q}$. □ 
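
The homomorphisms of the form (11.1) are easy to evaluate numerically, which gives a concrete feel for the argument just given. The sketch below is illustrative only: the helper names `torus_element` and `character`, the sample angles, and the search range $|m_j| \le 3$ are assumptions for the example, and floating-point evaluation can suggest but never prove rational independence. 

```python
import cmath
import math


def torus_element(thetas):
    """The point (e^{2 pi i theta_1}, ..., e^{2 pi i theta_k}) of (S^1)^k."""
    return tuple(cmath.exp(2j * math.pi * th) for th in thetas)


def character(ms, t):
    """Phi(u_1, ..., u_k) = u_1^{m_1} ... u_k^{m_k}, as in (11.1)."""
    value = 1 + 0j
    for m, u in zip(ms, t):
        value *= u ** m
    return value


# Dependent case: 2*theta_1 - theta_2 = 0, so Phi with (m_1, m_2) = (2, -1) kills t.
t_dependent = torus_element((math.sqrt(2), 2 * math.sqrt(2)))
dependent_value = character((2, -1), t_dependent)

# Independent case: no character with small nonzero exponents sends t near 1.
t_independent = torus_element((math.sqrt(2), math.sqrt(3)))
smallest_gap = min(
    abs(character((m1, m2), t_independent) - 1)
    for m1 in range(-3, 4)
    for m2 in range(-3, 4)
    if (m1, m2) != (0, 0)
)

if __name__ == "__main__":
    print("|Phi(t) - 1| in the dependent case:", abs(dependent_value - 1))
    print("min |Phi(t) - 1| over small exponents, independent case:", smallest_gap)
```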


11.2 Maximal Tori and the Weyl Group 


In this section, we introduce the concept of a maximal torus, which plays the same 
role in the compact group approach to representation theory as the Cartan subalgebra 
plays in the Lie algebra approach. 


Definition 11.6. A subgroup T of K is a torus if T is isomorphic to $(S^1)^k$ for 
some k. A subgroup T of K is a maximal torus if it is a torus and is not properly 
contained in any other torus in K. 


If K = SU(n), we may consider 

$$T = \left\{ \left. \begin{pmatrix} e^{i\theta_1} & & & \\ & \ddots & & \\ & & e^{i\theta_{n-1}} & \\ & & & e^{-i(\theta_1 + \cdots + \theta_{n-1})} \end{pmatrix} \right| \ \theta_j \in \mathbb{R} \right\},$$

which is a torus of dimension n − 1. If T is contained in another torus $S \subset SU(n)$, 
then every element s of S would commute with every element t of T. If we choose 
t to have distinct eigenvalues, then by Proposition A.2, s would have to be diagonal 
in the standard basis, meaning that $s \in T$. Thus, T is actually a maximal torus. 


Proposition 11.7. If T is a maximal torus, the Lie algebra $\mathfrak{t}$ of T is a maximal 
commutative subalgebra of $\mathfrak{k}$. Conversely, if $\mathfrak{t}$ is a maximal commutative subalgebra 
of $\mathfrak{k}$, the connected Lie subgroup T of K with Lie algebra $\mathfrak{t}$ is a maximal torus. 


Proof. If T is a maximal torus, it is commutative, which means that its Lie algebra $\mathfrak{t}$ 
is also commutative (Proposition 3.22). Suppose $\mathfrak{t}$ is contained in a commutative 
subalgebra $\mathfrak{s}$. Then it is also contained in a maximal commutative subalgebra 
$\mathfrak{s}'$ containing $\mathfrak{s}$. The connected Lie subgroup $S'$ with Lie algebra $\mathfrak{s}'$ must be 
commutative (since $S'$ is generated by exponentials of elements of $\mathfrak{s}'$) and closed 
(Proposition 5.24) and hence compact. Thus, by Theorem 11.2, $S'$ is a torus. Since 
T is a maximal torus, we must have $S' = T$ and thus $\mathfrak{s}' = \mathfrak{s} = \mathfrak{t}$, showing that $\mathfrak{t}$ is 
maximal commutative. 

In the other direction, if $\mathfrak{t}$ is maximal commutative, the connected Lie subgroup 
T with Lie algebra $\mathfrak{t}$ is closed (Proposition 5.24), hence compact. But T is also 
commutative and connected, hence a torus, by Theorem 11.2. If T is contained in a 
torus S, then $\mathfrak{t}$ is contained in the commutative Lie algebra $\mathfrak{s}$ of S. Since $\mathfrak{t}$ is maximal 
commutative, we have $\mathfrak{s} = \mathfrak{t}$ and since S is connected, S = T, showing that T is a 
maximal torus. □ 


Definition 11.8. If T is a maximal torus in K, then the normalizer of T, denoted 
N(T), is the group of elements $x \in K$ such that $xTx^{-1} = T$. The quotient group 

$$W := N(T)/T$$

is the Weyl group of T. 


Note that T is, almost by definition, a normal subgroup of N(T). If w is an 
element of W represented by $x \in N(T)$, then w acts on T by the formula 

$$w \cdot t = x t x^{-1}, \quad t \in T.$$

If $x \in N(T)$, the conjugation action of x maps T onto T. It follows that $\mathrm{Ad}_x$ maps 
the Lie algebra $\mathfrak{t}$ of T into itself. We define an action of W on $\mathfrak{t}$ by 

$$w \cdot H = \mathrm{Ad}_x(H), \quad H \in \mathfrak{t}. \tag{11.2}$$

Since our inner product is invariant under the adjoint action of K, the action of W 
on $\mathfrak{t}$ is by orthogonal linear transformations. 

We will see in Sect. 11.7 that the centralizer of T (that is, the group of those 
$x \in K$ such that $xtx^{-1} = t$ for all $t \in T$) is equal to T. It follows that W acts 
effectively on T, meaning that if $w \cdot t = t$ for all $t \in T$, then w is the identity 
element of W. It then follows from Corollary 3.49 that W also acts effectively on 
$\mathfrak{t}$. Thus, W may be identified with the group of orthogonal linear transformations 
of $\mathfrak{t}$ of the form $H \mapsto w \cdot H$. We will also show in Sect. 11.7 that this group 
of linear transformations coincides with the group generated by the reflections 
through the hyperplanes orthogonal to the roots. Thus, the Weyl group as defined 
in Definition 11.8 is naturally isomorphic to the Weyl group associated to the Lie 
algebra $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ in Sect. 7.4. Exercise 3, meanwhile, asks the reader to verify 
directly in the case of SU(n) that the centralizer of T is T and that N(T)/T is 
the permutation group on n entries, thus agreeing with the Weyl group for sl(n; C) 
computed in Sect. 7.7.1. 
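
To make the SU(n) case concrete, the following sketch checks, for n = 3, that conjugating a diagonal matrix by a permutation matrix again gives a diagonal matrix with the entries permuted. It illustrates only one half of the claim in Exercise 3, namely that permutation matrices normalize the diagonal torus. The helper names and the sample element `t_entries` are assumptions for the example. A permutation matrix for an odd permutation has determinant −1; multiplying it by a suitable scalar of absolute value 1 puts it in SU(n) without changing the conjugation action. 

```python
from itertools import permutations


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def permutation_matrix(sigma):
    """Matrix P with P e_j = e_{sigma[j]}."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]


def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def conjugate_torus_element(sigma, entries):
    """Compute P diag(entries) P^{-1}; the inverse of P is its transpose."""
    p = permutation_matrix(sigma)
    return matmul(matmul(p, diag(entries)), transpose(p))


def is_diagonal(a):
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if i != j)


# A diagonal element of SU(3) with distinct eigenvalues: i * (-i) * 1 = 1.
t_entries = [1j, -1j, 1]

# The 3-cycle sending index 0 -> 1 -> 2 -> 0 moves entry j to position sigma[j].
result = conjugate_torus_element([1, 2, 0], t_entries)

# Every permutation matrix maps the diagonal torus into itself.
all_diagonal = all(is_diagonal(conjugate_torus_element(list(s), t_entries))
                   for s in permutations(range(3)))

if __name__ == "__main__":
    print([result[i][i] for i in range(3)])
    print("all conjugates diagonal:", all_diagonal)
```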

The following “torus theorem” is a key result that underlies many of the 
developments in this chapter and the next two chapters. 


Theorem 11.9 (Torus Theorem). If K is a connected, compact matrix Lie group, 
the following results hold. 


1. If S and T are maximal tori in K, there exists an element x of K such that 
$T = xSx^{-1}$. 
2. Every element of K is contained in some maximal torus. 


The torus theorem has many important consequences; we will mention just two 
of these now. 


Corollary 11.10. If K is a connected, compact matrix Lie group, the exponential 
map for K is surjective. 


Proof. For any $x \in K$, choose a maximal torus T containing x. Since the 
exponential map for $T \cong (S^1)^k$ is surjective, x can be expressed as the exponential 
of an element of the Lie algebra of T. □ 


Corollary 11.11. Let K be a connected, compact matrix Lie group and let x be an 
arbitrary element of K. Then x belongs to the center of K if and only if x belongs 
to every maximal torus in K. 


Proof. Assume first that x belongs to the center Z(K) of K, and let T be any 
maximal torus in K. By the torus theorem, x is contained in a maximal torus S, 
and this torus is conjugate to T. Thus, there is some $y \in K$ such that $S = yTy^{-1}$. 
Since $x \in S$, we have $x = yty^{-1}$ for some $t \in T$, and thus $t = y^{-1}xy$. But we are 
assuming that x is central, and so, actually, t = x, showing that x belongs to T. 

In the other direction, assume x belongs to every maximal torus in K. Then for 
any $y \in K$, we can find some maximal torus T containing y, and this torus will also contain 
x. Since T is commutative, we conclude that x and y commute, showing that x is 
in Z(K). □ 


The torus theorem follows from the following result. 


Lemma 11.12. Let T be a fixed maximal torus in K. Then every $y \in K$ can be 
written in the form 

$$y = xtx^{-1}$$

for some $x \in K$ and $t \in T$. 

If K = SU(n) and T is the diagonal subgroup of K, Lemma 11.12 follows easily 
from the fact that every unitary matrix has an orthonormal basis of eigenvectors. 
The proof of the general case of Lemma 11.12 requires substantial preparation and 
is given in Sect. 11.5. 


Proof of torus theorem assuming Lemma 11.12. Since each $y \in K$ can be written 
as $y = xtx^{-1}$, we see that y belongs to the maximal torus $xTx^{-1}$. 

Next we show that every maximal torus S in K is conjugate to T, from which 
it follows that any two maximal tori $S_1$ and $S_2$ are conjugate to each other. Suppose s 
is an element of S such that the subgroup of S generated by s is dense in S, as in 
Proposition 11.4. Then we can choose some $x \in K$ and $t \in T$ such that $s = xtx^{-1}$ 
and thus $t = x^{-1}sx$. Thus, $x^{-1}s^k x = t^k \in T$ for all integers k. Since the set of elements 
of the form $s^k$ is dense in S and T is closed, we must have $x^{-1}Sx \subset T$. But since 
$x^{-1}Sx$ is, like S, a maximal torus, we actually have $x^{-1}Sx = T$. □ 


11.3 Mapping Degrees 


We now introduce a method that we will use in proving the torus theorem in 
Sect. 11.5. The current section requires greater familiarity with manifolds than 
elsewhere in the book. In addition to differential forms (Appendix B), we make 
use of the exterior derivative (“d”), the pullback of a form by a smooth map, and 
Stokes's theorem. See Chapters 14 and 16 in [Lee] for more information. For our 
purposes, the main result of this section is Corollary 11.17, which gives a condition 
guaranteeing that a map between two manifolds of the same dimension is surjective. 
The reader who is not familiar with manifold theory should still be able to get an 
idea of what is going on from the example in Figures 11.3 and 11.4. 

If V is a finite-dimensional vector space over R, we may define an equivalence 
relation on ordered bases of V by declaring two ordered bases $(v_1, \ldots, v_n)$ and 
$(v_1', \ldots, v_n')$ to be equivalent if the unique linear transformation L mapping $v_j$ to $v_j'$ 
has positive determinant. The set of ordered bases for V then consists of exactly 
two equivalence classes. An orientation for V is a choice of one of these two 
equivalence classes. Once an orientation of V has been fixed, an ordered basis for V 
is said to be oriented if it belongs to the chosen equivalence class. An orientation 
on a smooth manifold M is then a continuous choice of orientation on each tangent 
space to M. A smooth manifold together with a choice of orientation is called an 
oriented manifold. 

We consider manifolds that are closed—that is, compact, connected, and without 
boundary—and oriented. We will be interested in smooth maps between two closed, 
oriented manifolds of the same dimension. 


Definition 11.13. Let X and Y be closed, oriented manifolds of dimension $n \geq 1$ 
and let $f : X \to Y$ be a smooth map. A point $y \in Y$ is a regular value of f if 
for all $x \in X$ such that f(x) = y, the differential $f_*(x)$ of f at x is invertible. 
A point $y \in Y$ is a singular value of f if there exists some $x \in X$ such that 
f(x) = y and the differential $f_*(x)$ of f at x is not invertible. 


It is important to note that if y is not in the range of f, then y is a regular value. 
After all, if there is no x with f(x) = y, then it is vacuously true that for every x 
with f(x) = y, the differential $f_*(x)$ is invertible. 


Proposition 11.14. Let X, Y, and f be as in Definition 11.13. If y is a regular 
value of f, then y has only finitely many preimages under f. 


Proof. If $f^{-1}(\{y\})$ were infinite, the set would have to have an accumulation point 
$x_0$, by the assumed compactness of X. Then by continuity, we would have $f(x_0) = y$. 
Since y is a regular value of f, the differential $f_*(x_0)$ would be invertible. But then the 
inverse function theorem would say that f is injective in a neighborhood of $x_0$, 
which is impossible since every neighborhood of $x_0$ contains infinitely many points 
x with f(x) = y. □ 


Saying that X and Y are oriented means that we have chosen a consistent 
orientation on each tangent space to X and to Y. If $f : X \to Y$ is smooth and 
the differential $f_*(x)$ of f at x is invertible, then $f_*(x)$ is either an orientation 
preserving or an orientation reversing map of $T_x(X)$ to $T_{f(x)}(Y)$. Since $f_*$ is 
assumed to be continuous, if $f_*$ is invertible and orientation preserving at x, it 
is invertible and orientation preserving in a neighborhood of x, and similarly if 
$f_*$ is invertible and orientation reversing at x. 


Definition 11.15. If y is a regular value of f, let the signed number of preimages 
of y denote the number of preimages, where each $x \in f^{-1}(\{y\})$ is counted with a plus 
sign if $f_*(x)$ is orientation preserving and with a minus sign if $f_*(x)$ is orientation 
reversing. 


The main result of the section is the following. 


Theorem 11.16. Let X and Y be closed, oriented manifolds of dimension $n \geq 1$ 
and let $f : X \to Y$ be a smooth map. Then there exists an integer k such that for 
every regular value y of f, the signed number of preimages of y is equal to k. 


If there are, in fact, any regular values of f, the integer k is unique and is called 
the mapping degree of f. (Actually, Sard’s theorem guarantees that every such f 
has a nonempty set of regular values, but we do not need to know this, since in 
our application of Theorem 11.16 in Sect. 11.5, we will find regular values of the 
relevant map “by hand.”) See the “Degree Theory” section in Chapter 17 of [Lee] 
for more information. 


Corollary 11.17. Let X,Y, and f be as in Theorem 11.16. If there exists a regular 
value y of f for which the signed number of preimages is nonzero, then f must 
map onto Y. 


Proof. If there existed some $y'$ that is not in the range of f, then $y'$ would be a 
regular value and the (signed) number of preimages of $y'$ would be zero. This would 
contradict Theorem 11.16. □ 

Fig. 11.3 The graph of a map f from $S^1$ to $S^1$. The signed number of preimages 
of each regular value is 1 


Figure 11.3 illustrates Theorem 11.16. The figure shows the graph of a map f 
from $S^1$ (which we think of as $[0, 2\pi]$ with ends identified) to itself. The singular 
values are the points marked a and b on the y-axis; all other values are regular. For 
two different values y and $y'$, we compute the signed number of preimages. The 
point y has three preimages, but $f'$ is negative at one of these, so that the signed 
number of preimages is 1 − 1 + 1 = 1. Meanwhile, the point $y'$ has one preimage, 
at which $f'$ is positive so that the signed number of preimages is 1. The mapping 
degree of f is 1 and, consistent with Corollary 11.17, f is surjective. Meanwhile, 
Figure 11.4 shows the same map in a more geometric way, as a map between two 
manifolds X and Y, each of which is diffeomorphic to $S^1$. 
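
The counting described above can be reproduced numerically for a specific circle map. The sketch below does not use the map of Figure 11.3, whose formula is not given in the text; instead it assumes the illustrative map $f(\theta) = \theta + 2\sin\theta \pmod{2\pi}$, which is likewise of degree 1 but not monotone. Working with the lift $F(\theta) = \theta + 2\sin\theta$, each crossing of a level $y + 2\pi m$ is a preimage of y, counted positively when F is increasing and negatively when F is decreasing. The function names and the step count are assumptions for the example. 

```python
import math


def lifted_map(theta):
    """Lift F(theta) = theta + 2 sin(theta) of an illustrative degree-one circle map."""
    return theta + 2.0 * math.sin(theta)


def count_preimages(y, steps=10000):
    """Return (signed, unsigned) counts of preimages of y under F mod 2*pi.

    The interval [0, 2*pi] is sampled on a fine grid. Between consecutive grid
    points, floor((F - y) / (2*pi)) changes by +1 at an upward crossing of a
    level y + 2*pi*m and by -1 at a downward crossing. The result is reliable
    only when y is not extremely close to a singular value.
    """
    two_pi = 2.0 * math.pi
    signed = 0
    unsigned = 0
    previous = math.floor((lifted_map(0.0) - y) / two_pi)
    for i in range(1, steps + 1):
        theta = two_pi * i / steps
        current = math.floor((lifted_map(theta) - y) / two_pi)
        signed += current - previous
        unsigned += abs(current - previous)
        previous = current
    return signed, unsigned


if __name__ == "__main__":
    for y in (1.0, 3.0, 5.0):
        print(y, count_preimages(y))
```

For this map the singular values are approximately 2.457 and 3.826. A value between them, such as y = 3, has three preimages, while values outside that range have one, and the signed count is 1 in every case, in agreement with Theorem 11.16. 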

We now turn to the proof of Theorem 11.16; see also Theorem 17.35 in 
[Lee]. Using the inverse function theorem, it is not hard to show that the signed 
number of preimages and the unsigned number of preimages are both constant in a 
neighborhood of any regular value y. This result, however, does not really help us, 
because the set of regular values may be disconnected. (See Figure 11.3.) Indeed, 
Figure 11.3 shows that the unsigned number of preimages may not be constant; we 
need a creative method to show that the signed number of preimages is constant on 
the set of regular values. 

Fig. 11.4 The map indicated by the arrows is the same one as in Figure 11.3, but shown more 
geometrically 

Our tool for proving this result is that of differential forms. If $f : X \to Y$ is 
an orientation-preserving diffeomorphism, then for any n-form $\alpha$ on Y, the integral 
of $f^*(\alpha)$ over X will equal the integral of $\alpha$ over Y. If, on the other hand, f is an 
orientation-reversing diffeomorphism, the integral of $f^*(\alpha)$ will be the negative of 
the integral of $\alpha$. Suppose now that $f : X \to Y$ is a smooth map, not necessarily a 
diffeomorphism. Suppose that y is a regular value of f and that $x_1, \ldots, x_N$ are the 
elements of $f^{-1}(\{y\})$. Then we can find a neighborhood V of y such that $f^{-1}(V)$ is 
a disjoint union of neighborhoods $U_1, \ldots, U_N$ of $x_1, \ldots, x_N$ and such that f maps 
each $U_j$ diffeomorphically onto V. Furthermore, by shrinking V if necessary, we 
can assume that for each j, the differential $f_*(x)$ is either orientation preserving at 
every point of $U_j$ or orientation reversing at every point of $U_j$. Then for any n-form 
$\alpha$ supported in V, we see that 

$$\int_X f^*(\alpha) = k \int_Y \alpha, \tag{11.3}$$

where k is the signed number of preimages of y. 
If $\alpha$ is chosen so that $\int_Y \alpha \neq 0$, then (11.3) becomes 

$$k = \frac{\int_X f^*(\alpha)}{\int_Y \alpha}. \tag{11.4}$$

The right-hand side of (11.4) gives us an analytic method for determining the signed 
number of preimages. Our goal is to use (11.4) to show that k is constant on the set 
of regular values. To this end, we will establish a key result: The value of the right-hand 
side of (11.4) is unchanged if $\alpha$ is “deformed” by pulling it back by a family 
of diffeomorphisms of Y. 

Suppose, then, that y and $y'$ are regular values of f with the signed number of 
preimages being k and $k'$, respectively. We will construct a family $\alpha_t$, $0 \leq t \leq 1$, of 
n-forms on Y such that $\alpha_0$ is supported near y and $\alpha_1$ is supported near $y'$. For all 
t, the expression 

$$\frac{\int_X f^*(\alpha_t)}{\int_Y \alpha_t} \tag{11.5}$$
y “t 
makes sense, even if the support of $\alpha_t$ contains singular values of f. Furthermore, 
the values of both integrals in (11.5) are, as we will show, independent of t. Thus, 
we will conclude that 

$$k = \frac{\int_X f^*(\alpha_0)}{\int_Y \alpha_0} = \frac{\int_X f^*(\alpha_1)}{\int_Y \alpha_1} = k',$$

as claimed. 


Lemma 11.18. Suppose $\Psi_t$ is a continuous, piecewise-smooth family of 
orientation-preserving diffeomorphisms of Y, with $\Psi_0$ being the identity map. 
For any n-form $\alpha$ on Y, let $\alpha_t = \Psi_t^*(\alpha)$. Then for all t, we have 

$$\int_Y \alpha_t = \int_Y \alpha \tag{11.6}$$

and 

$$\int_X f^*(\alpha_t) = \int_X f^*(\alpha). \tag{11.7}$$


Proof. Saying that $\Psi_t$ is piecewise smooth means that we can divide [0, T] into 
finitely many subintervals on each of which $\Psi_t(x)$ is smooth in x and t. Since $\Psi_t$ is 
continuous, it suffices to prove the result on each of the subintervals, that is, in the 
case where $\Psi_t(x)$ is smooth, which we now assume. The result (11.6) holds because 
$\Psi_t$ is an orientation preserving diffeomorphism. 

To establish (11.7), we show that the left-hand side of (11.7) is independent of t. 
Note that 

$$f^*(\alpha_t) = f^*(\Psi_t^*(\alpha)) = (\Psi_t \circ f)^*(\alpha).$$

Thus, if we define $g : X \times [0, T] \to Y$ by 

$$g(x, t) = \Psi_t(f(x)),$$

we have 

$$\int_X f^*(\alpha_T) - \int_X f^*(\alpha_0) = \int_{\partial(X \times [0,T])} g^*(\alpha).$$

Using Stokes's theorem and a standard result relating pullbacks and exterior 
derivatives, we obtain 

$$\int_X f^*(\alpha_T) - \int_X f^*(\alpha_0) = \int_{X \times [0,T]} d(g^*(\alpha)) = \int_{X \times [0,T]} g^*(d\alpha).$$

But since $\alpha$ is a top-degree form on Y, we must have $d\alpha = 0$, showing that 
$\int_X f^*(\alpha_T) = \int_X f^*(\alpha_0)$. □ 


Proof of Theorem 11.16. To complete the proof of Theorem 11.16, it remains only 
to address the existence of a continuous, piecewise smooth, orientation-preserving 
family $\Psi_t$ of diffeomorphisms of Y such that $\Psi_0$ is the identity and such that 
$\Psi_1(y') = y$. (Thus, if $\alpha$ is supported near y, then $\Psi_1^*(\alpha)$ will be supported near 
$y'$.) We actually only require this in the case that Y is a compact Lie group, in 
which case, the diffeomorphisms can easily be constructed using the group structure 
on Y. Nevertheless, we will outline an argument for the general result. Let U be a 
neighborhood of $y'$ that is a rectangle in some local coordinate system around $y'$. 
Then it is not hard to construct a family of diffeomorphisms of Y that are the identity 
on $Y \setminus U$ and that map $y'$ to any desired point of U. (See Exercise 7.) 

If $y \in U$, we are done. If not, we consider the set E of points $z \in Y$ such that 
$y'$ can be moved to z by a family of diffeomorphisms of the desired sort. If $z \in E$, 
we have, by assumption, a family moving $y'$ to z. We can then use the argument in 
the preceding paragraph to move z to any point $z'$ in a neighborhood of z. Thus, E 
is open and contains $y'$. We now claim that E is closed. If z is a limit point of E, 
then in any neighborhood V of z, there is an element $z'$ of E. Thus, by the argument 
in the preceding paragraph, we can move $z'$ to z by a family of diffeomorphisms, so 
that $z \in E$. Since E is both open and closed and nonempty (because it contains $y'$) 
and Y is connected, E must be all of Y. □ 


Proposition 11.19. Let X,Y, and f be as in Theorem 11.16, and suppose f has 
mapping degree k. Then for every n-form a on Y, we have 


| ro=kf a (11.8) 
Pd Y 


In Figure 11.4, for example, the map f indicated by the arrows has mapping 
degree 1. Thus, for every form @ on Y, the integral of f*(œ) over X is the same 
as the integral of œ over Y, even though f is not a diffeomorphism. When pulling 
back the part of œ between a and b, we get three separate integrals on X, but one of 
these integrals occurs with a minus sign, because fx is orientation reversing on the 
middle of the three intervals over [a, b]. 


Proof. We have already noted in (11.3) that if y is a regular value of f, there is
a neighborhood U of y such that (11.8) holds for all α supported in U. By the
deformation argument in the proof of Theorem 11.16, the same result holds for any
y in Y. Since Y is compact, we can then cover Y by a finite number of open sets U_j
such that (11.8) holds whenever α is supported in U_j. Using a partition of unity, we
can express any form α as a sum of forms α_j such that α_j is supported on U_j, and
the general result follows. □
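
As a concrete check of (11.8), one can take the simplest case X = Y = S^1 with f(θ) = kθ (mod 2π), which has mapping degree k when k > 0. The following Python sketch is only an illustration under these assumptions: the 1-form α = g(φ) dφ, the value k = 3, and all function names are chosen here and do not come from the text.

```python
import math

def integrate_over_circle(func, n=2000):
    """Midpoint rule for the integral of func over [0, 2*pi]."""
    h = 2.0 * math.pi / n
    return h * sum(func((j + 0.5) * h) for j in range(n))

def g(phi):
    # Hypothetical coefficient of the 1-form alpha = g(phi) dphi on Y = S^1.
    return 1.0 + 0.5 * math.cos(phi) + 0.25 * math.sin(2.0 * phi)

def pullback_coefficient(theta, k):
    # For f(theta) = k*theta (mod 2*pi), f^*(alpha) = g(k*theta) * k dtheta.
    return g(k * theta) * k

k = 3
integral_of_pullback = integrate_over_circle(lambda th: pullback_coefficient(th, k))
k_times_integral = k * integrate_over_circle(g)
print(integral_of_pullback, k_times_integral)
```

The two printed numbers should agree up to rounding, which is the content of (11.8) in this special case.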


11.4 Quotient Manifolds 


Before coming to the proof of the torus theorem, we require one more preparatory
concept, that of the quotient of a matrix Lie group by a closed subgroup. Throughout
this section, we assume that G is a matrix Lie group with Lie algebra g and that H
is a closed subgroup of G with Lie algebra h. Even if H is not normal, we can still
consider the quotient G/H as a set (the set of left cosets gH of H). We now show
that G/H has the structure of a smooth manifold. We will let [g] denote the coset
gH of H in G and we will let Q : G → G/H denote the quotient map. Recall that
a topological structure on a set E is Hausdorff if for every pair of distinct points
x, y ∈ E, there exist disjoint open sets U and V in E with x ∈ U and y ∈ V.


Lemma 11.20. Define a topology on G/H by decreeing that a set U in G/H is
open if and only if Q^{-1}(U) is open in G. Then G/H is Hausdorff with respect to
this topology. Furthermore, G/H has a countable dense subset.


If we did not assume that H is closed, the Hausdorff condition for G/H would, 
in general, fail. 


Proof. If E ⊂ G is invariant under the right action of H, then Q^{-1}(Q(E)) = E.
From this observation, we can see that the map U ↦ Q^{-1}(U) gives a bijection
between the collection of open sets in G/H and the collection of right-H-invariant
open sets in G. Suppose that [x] and [y] are distinct points in G/H, that is, that xH is
disjoint from yH. In light of the above description of open sets in G/H, it suffices to
find disjoint open, right-H-invariant subsets A and B of G, with x ∈ A and y ∈ B.

Since H is closed, yH is also closed. Thus, we can find a neighborhood U of
x that does not intersect yH. We can then find a subneighborhood V ⊂ U of x
such that the closure V̄ of V is compact and contained in U. We then consider the
set V̄H, the set of points of the form vh with v ∈ V̄ and h ∈ H. We claim that V̄H
is closed in G. To see this, suppose v_n h_n converges to some g ∈ G, where v_n ∈ V̄
and h_n ∈ H. Since V̄ is compact, we can assume (after passing to a subsequence)
that v_n converges to some v ∈ V̄, in which case, h_n converges to the point
h := v^{-1} g. Since H is closed, h belongs to H and the limit point vh belongs to V̄H.
Since V̄H is closed, the set B := (V̄H)^c is open. Thus, B and A := VH are the desired
disjoint, open, right-H-invariant sets in G.

Finally, since G inherits its topology from the separable metric space M_n(C) ≅
R^{2n^2}, it follows that G has a countable dense subset E. Furthermore, the quotient
map Q : G → G/H is surjective and (by the definition of the topology on G/H)
continuous. Thus, Q(E) will be dense in G/H. □

Fig. 11.5 The gray region
indicates the set of points of
the form gh with g ∈ exp(U)
and h ∈ H. This set is
diffeomorphic to exp(U) × H


Lemma 11.21 (Slice Lemma). Let G be a matrix Lie group with Lie algebra g and
let h be a subalgebra of g. Decompose g as a vector space as g = h ⊕ f for some
subspace f of g and define a map A : f × H → G by


    A(X, h) = e^X h.


Then there is a neighborhood U of 0 in f such that A is injective on U × H and A
maps U × H diffeomorphically onto an open subset of G. In particular, if X_1 and
X_2 are distinct elements of U, then e^{X_1} and e^{X_2} belong to distinct cosets of G/H.


The term “slice lemma” refers to the fact that the map sending X ∈ U to e^X
slices across the different cosets of H. Figure 11.5 illustrates the slice lemma in the
case in which G = S^1 × S^1 and H is the subgroup consisting of points of the form
(ei? f erry. 

Proof. We identify the tangent spaces at the identity to both f × H and G with f ⊕ h.
If X(t) is a smooth curve in f passing through 0 at t = 0, we have


    \frac{d}{dt} A(X(t), I) \Big|_{t=0} = \frac{d}{dt} e^{X(t)} \Big|_{t=0} = X'(0).


Meanwhile, if h(t) is a smooth curve in H passing through I at t = 0, we have


    \frac{d}{dt} A(0, h(t)) \Big|_{t=0} = \frac{d}{dt} h(t) \Big|_{t=0} = h'(0).


From this calculation, and the linearity of the differential, we can see that the
differential A_* of A at (0, I) is the identity map of f ⊕ h to itself. Thus, by continuity,
A_*(X, I) is invertible for X in some neighborhood U of 0 in f.

Meanwhile, the map A commutes with the right action of H: 


    A(X, h h_0) = A(X, h) h_0.


From this, it is easy to see that if A_*(X, I) is invertible, then A_*(X, h) is invertible
for all h. We conclude, then, that A_*(X, h) is invertible for all (X, h) ∈ U × H. By
the inverse function theorem, then, A maps a small neighborhood of each point in
U × H injectively onto an open set in G. In particular, the image of U × H under
A is an open subset of G.

We must now show that by shrinking U as necessary, we can ensure that
A is globally injective on U × H. By the inverse function theorem, there are
neighborhoods U’ of 0 in f and V of I in H such that A maps U’ × V injectively
into G. If we choose a small subneighborhood U” of 0 in f and X and X’ are in U”,
then e^{-X’} e^X will be close to the identity in G. Indeed, if U” is small enough, then
whenever e^{-X’} e^X happens to be in H, the element e^{-X’} e^X will actually be in V.

We now claim that A is injective on U” × H. To see this, suppose e^X h = e^{X’} h’
with X, X’ ∈ U” and h, h’ ∈ H. Then


    h’ h^{-1} = e^{-X’} e^X ∈ V,

by our choice of U”. But then

    e^X = e^{X’} (h’ h^{-1}),


and since h’ h^{-1} ∈ V, we must have X = X’ and I = h’ h^{-1}, by the injectivity of A
on U’ × V. Thus, actually, X = X’ and h = h’, establishing the desired injectivity.
□


It is instructive to contemplate the role in the preceding proof played by the
assumption that H is closed. It is evident from Figure 1.1 that the slice lemma can
fail if H is not closed. (For example, even for very small nonzero X ∈ f, the element
e^X can be in H.) On the other hand, even if H is not closed, Theorem 5.23 says that
there is a new topology on H and an atlas of coordinate neighborhoods making H
into a smooth manifold, in such a way that the inclusion of H into G is smooth. If
we use this structure on H, the map A in Lemma 11.21 is smooth and much of the
proof proceeds in the same way as when H is closed. The proof of global injectivity
of A, however, breaks down because the new topology on H does not agree with
the topology that H inherits from G. Thus, even if e^{-X’} e^X belongs to H and is very
close to the identity in G, this element may not be close to the identity in the new
topology on H. (See Figure 5.4.) Thus, we cannot conclude that e^{-X’} e^X is in V, and
the proof of injectivity fails.

Theorem 11.22. If G is a matrix Lie group and H a closed subgroup of G, then
G/H can be given the structure of a smooth manifold with


    dim(G/H) = dim G − dim H


in such a way that (1) the quotient map Q : G → G/H is smooth, (2) the differential
of Q at the identity maps T_I(G) onto T_{[I]}(G/H) with kernel h, and (3) the left action
of G on G/H is smooth.


Proof. Let U ⊂ f be as in Lemma 11.21. Then for each g ∈ G, let A_g be the map
from U × H into G given by


    A_g(X, h) = g e^X h.


Combining Lemma 11.21 with a translation by g, we see that A_g is a diffeomorphism
of U × H onto its image, and that A_g(X, h) and A_g(X’, h’) are in distinct
cosets of H provided that X ≠ X’. Let W_g be the (open) image of U × H under
A_g and let V_g = Q(W_g), that is,


    V_g = {[g e^X] ∈ G/H | X ∈ U}.


Then Q^{-1}(V_g) = W_g, showing that V_g is open in G/H. By the above properties of
A_g, the map X ↦ [g e^X] is an injective map of U onto V_g.

We now propose to use the maps X ↦ [g e^X] as local coordinates on G/H,
where U ⊂ f may be identified with R^k, with k = dim f = dim g − dim h. By
the way the topology on G/H is defined, the map Q is continuous and a function
f on G/H is continuous if and only if f ∘ Q is continuous on G. Thus, the map
X ↦ Q(g e^X) = [g e^X] is continuous. Furthermore, if we compose the inverse map
[g e^X] ↦ X with Q, we obtain the map g e^X h ↦ X. This map is continuous because
it consists of the inverse of the diffeomorphism (X, h) ↦ g e^X h, combined with the
map (X, h) ↦ X.

Thus, G/H is locally homeomorphic to R^k. Since, also, G/H is Hausdorff and
has a countable dense subset (Lemma 11.20), we see that G/H is a topological 
manifold. 

Now, the coordinate patches [g e^X] clearly cover G/H. If two such patches
overlap, the change of coordinates map is the map X ↦ X’, where [g e^X] = [g’ e^{X’}].
This map can be computed by mapping X to g e^X ∈ G and then applying (A_{g’})^{-1}
to write g e^X as g e^X = g’ e^{X’} h’. Since (A_{g’})^{-1} is smooth, we see that the change
of coordinates map is smooth. Thus, we may give a smooth structure to G/H using
these coordinates.

It is now a straightforward matter to check the remaining claims about the smooth
structure on G/H. To see, for example, that Q is smooth, pick some g in G and write
points near g as g e^X h, with X ∈ U and h ∈ H. Then Q(g e^X h) = [g e^X]. Thus, Q
can be written locally as the inverse of A_g followed by the map (X, h) ↦ [g e^X],
which is smooth in our local identification of G/H with U. The remaining claims
are left as an exercise to the reader (Exercise 8). □

Proposition 11.23. Suppose there exists an inner product on g that is invariant
under the adjoint action of H, and let V denote the orthogonal complement of h
with respect to this inner product. Then we may identify the tangent space at each
point [g] of G/H with V by writing v ∈ T_{[g]}(G/H) as


    v = \frac{d}{dt} [g e^{tX}] \Big|_{t=0}, \qquad X \in V.


This identification of T_{[g]}(G/H) with V is unique up to the adjoint action of H on V.


Note that since the adjoint action of H on g is orthogonal and preserves h, this
action also preserves V, by Proposition A.10.


Proof. Since g = h ⊕ V, Point 2 of Theorem 11.22 tells us that every tangent vector
v to G/H at the identity coset can be expressed uniquely as


    v = \frac{d}{dt} [e^{tX}] \Big|_{t=0}, \qquad X \in V.


For any [g] ∈ G/H, we identify T_{[g]}(G/H) with T_{[I]}(G/H) ≅ V by using the left
action of g. Thus, each v ∈ T_{[g]}(G/H) can be written uniquely as


    v = \frac{d}{dt} [g e^{tX}] \Big|_{t=0}, \qquad X \in V.


If we use a different element gh, h ∈ H, of the same coset, we get a different
identification of T_{[g]}(G/H) with V, as follows:


    [g h e^{tX}] = [g h e^{tX} h^{-1}] = [g e^{tX’}],    (11.9)


where X’ = Ad_h(X). Differentiating (11.9) shows that the two identifications differ
by the adjoint action of H. □


A volume form on a manifold M of dimension n is a nowhere-vanishing n-form 
on M. As we have already discussed in the proof of Theorem 4.28, each matrix Lie 
group G has a volume form that is invariant under the right action of G on itself. 
The same argument shows that G has a volume form invariant under the left action 
of G on itself. (For some groups G, it is possible to find a single volume form that 
is invariant under both the left and right action of G on itself, but this is not always
the case.) We now address the existence of an invariant volume form on a quotient 
manifold. 


Proposition 11.24. If G is a matrix Lie group and H is a connected compact 
subgroup of G, there exists a volume form on G/H that is invariant under the left 
action of G. This form is unique up to multiplication by a constant. 

In the case H = {I}, we conclude that there is a left-invariant volume form on G
itself and that this form is unique up to a constant. (In this chapter, it is convenient 
to use a left-invariant volume form on G, rather than a right-invariant form as in 
Sect. 4.4.) 


Proof. Since H is compact, there exists an inner product on g that is invariant under
the adjoint action of H. Let V denote the orthogonal complement of h in g, so that
V is invariant under the adjoint action of H. Since H is connected, the restriction
to V of Ad_h will actually be in SO(V) for all h ∈ H.

Now pick an orientation on V and let α be the standard volume form on V, that is,
the unique one for which α(e_1, …, e_N) = 1 whenever (e_1, …, e_N) is an oriented
orthonormal basis for V. Since the action of Ad_h on V has determinant 1, we have


    α(Ad_h(v_1), …, Ad_h(v_N)) = α(v_1, …, v_N)


for all v_1, …, v_N ∈ V.

Now, the tangent space to G/H at the identity coset is identified with V.
We define a form on G/H as follows. At the identity coset, we take it to be α.
At any other coset [g], we use the action of g ∈ G to transport α from [I] to [g].
If [g] = [g’], then g’ = gh for some h ∈ H. The action of h on T_{[I]}(G/H) ≅ V is the
adjoint action, which preserves α. Thus, the resulting form at [g] is independent of
the choice of g.

Finally, to address the uniqueness, note that any two top degree forms on G/H
must agree up to a constant at the identity coset [I]. But since the left action of
G on G/H is transitive, the value of the form at [I] uniquely determines the form
everywhere. Thus, any two left-invariant forms on G/H must be equal up to an
overall constant. □


11.5 Proof of the Torus Theorem 


Having made the required preparations in Sects. 11.3 and 11.4, we are now ready to 
complete the proof of Theorem 11.9. It remains only to prove Lemma 11.12; to that 
end, we define a key map. 


Definition 11.25. Let T be a fixed maximal torus in K. Let

    Φ : T × (K/T) → K

be defined by

    Φ(t, [x]) = x t x^{-1},    (11.10)


where [x] denotes the coset xT in K/T.




Note that if s ∈ T, then since T is commutative, we have


    (xs) t (xs)^{-1} = x s t s^{-1} x^{-1} = x t x^{-1},


showing that Φ is well defined as a map of T × (K/T) into K. Lemma 11.12
amounts to saying that Φ is surjective. Since T × (K/T) has the same dimension
as K, we may apply Theorem 11.16 and Corollary 11.17. Thus, if there is even one
regular value of Φ for which the signed number of preimages is nonzero, Φ must
be surjective. Our strategy will be to find a certain class of points y ∈ K for which
we can (1) determine all of the preimages of y under Φ, and (2) verify that Φ_* is
invertible and orientation preserving at each of the preimages.
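
The independence of x t x^{-1} from the choice of coset representative can be checked numerically in the case K = SU(2) with T the diagonal subgroup. The Python sketch below is a spot check at one arbitrarily chosen point, not a proof; the specific angles, the matrix entries, and the helper names are assumptions made here for illustration.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose; equals the inverse for unitary matrices."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def torus_element(theta):
    # Element diag(e^{i theta}, e^{-i theta}) of the diagonal torus T.
    return [[cmath.exp(1j * theta), 0j], [0j, cmath.exp(-1j * theta)]]

def su2_element(a, b):
    # Normalize (a, b) so that [[a, b], [-conj(b), conj(a)]] lies in SU(2).
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def phi(t, x):
    # Phi(t, [x]) = x t x^{-1}, computed from a representative x of [x].
    return matmul(matmul(x, t), dagger(x))

t = torus_element(0.7)
s = torus_element(-1.9)
x = su2_element(0.3 + 0.4j, -0.2 + 0.8j)
xs = matmul(x, s)  # another representative of the same coset xT

difference = max(abs(phi(t, x)[i][j] - phi(t, xs)[i][j])
                 for i in range(2) for j in range(2))
print(difference)
```

The printed difference should be at the level of floating-point rounding, reflecting the identity (xs) t (xs)^{-1} = x t x^{-1}.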


Lemma 11.26. Let t ∈ T be such that the subgroup generated by t is dense in T
(Proposition 11.4). Then the preimage Φ^{-1}({t}) of t consists precisely of elements
of the form (x^{-1} t x, [x]) with [x] belonging to W = N(T)/T. In particular, if
x s x^{-1} = t for some x ∈ K and s ∈ T, then s must be of the form s = w^{-1} · t for
some w ∈ W.


Note that if x and y in N(T) represent distinct elements of W = N(T)/T, then
[x] and [y] are distinct elements of K/T. Thus, the lemma tells us that there is a
one-to-one correspondence between Φ^{-1}({t}) and W.


Proof. If x ∈ N(T), then x^{-1} t x ∈ T and we can see that Φ(x^{-1} t x, [x]) = t. In the
other direction, if x s x^{-1} = t, then

    x^{-1} t^m x = s^m ∈ T

for all integers m, so that x^{-1} T x ⊂ T by our assumption on t. Since x^{-1} T x is again
a maximal torus, we must actually have x^{-1} T x = T and, thus, T = x T x^{-1}, showing
that x ∈ N(T). Furthermore, since x s x^{-1} = t, we have s = x^{-1} t x = w^{-1} · t, where
w = [x]. □
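
For K = SU(2) with T the diagonal subgroup, the two preimages described by Lemma 11.26 can be exhibited explicitly. In the Python sketch below, the matrix x is one assumed representative of the nontrivial Weyl group element, and t = diag(e^{i}, e^{-i}) is chosen because 1/(2π) is irrational, so the powers of t are dense in T; the helper names are introduced here only for illustration.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(a):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def torus_element(theta):
    return [[cmath.exp(1j * theta), 0j], [0j, cmath.exp(-1j * theta)]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Assumed representative in N(T) of the nontrivial element of W.
x = [[0j, 1 + 0j], [-1 + 0j, 0j]]
t = torus_element(1.0)

# s = x^{-1} t x should again lie in T; here it is the inverse of t.
s = matmul(matmul(inverse_2x2(x), t), x)
s_is_t_inverse = close(s, torus_element(-1.0))

# Phi(s, [x]) = x s x^{-1} should return t, giving a second preimage of t.
maps_back_to_t = close(matmul(matmul(x, s), inverse_2x2(x)), t)
print(s_is_t_inverse, maps_back_to_t)
```

Together with the obvious preimage (t, [I]), this gives two preimages of t, matching |W| = 2 for SU(2).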


We now compute the differential of Φ. Using Proposition 11.23 with (G, H)
equal to (T, {I}), (K, T), and (K, {I}), we identify the tangent space at each point
in T × (K/T) with t ⊕ f = k and the tangent space at each point in K with k.
Since we are trying to determine the signed number of preimages of Φ, we must
choose orientations on T × (K/T) and on K. To this end, we choose orientations
on the vector spaces t and f and use the obvious associated orientation on k = t ⊕ f.
We then define orientations on T × (K/T) and K using the above identifications
of the tangent spaces with t ⊕ f = k. The identification of the tangent spaces to
K/T with f is unique up to the adjoint action of T (Proposition 11.23). Since T is
connected, this action will have positive determinant, showing that the orientation
on T × (K/T) is well defined. Recall that the (orthogonal) adjoint action of T on k
preserves t and thus, also, f := t^⊥.


Proposition 11.27. Let (t, [x]) be a fixed point in T × (K/T). If we identify the
tangent spaces to T × (K/T) and to K with t ⊕ f = k, then the differential of Φ at
(t, [x]) is represented by the following operator:

    \Phi_* = \mathrm{Ad}_x \begin{pmatrix} I & 0 \\ 0 & \mathrm{Ad}'_{t^{-1}} - I \end{pmatrix},    (11.11)


where Ad’_{t^{-1}} denotes the restriction of Ad_{t^{-1}} to f.


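Proof. For H ∈ t, we compute that


    \frac{d}{d\tau} \Phi(t e^{\tau H}, [x]) \Big|_{\tau=0}
        = \frac{d}{d\tau}\, x t e^{\tau H} x^{-1} \Big|_{\tau=0}
        = x t H x^{-1}
        = (x t x^{-1})\, \mathrm{Ad}_x(H).


Since we identify the tangent space to K at x t x^{-1} with k using the left action of
x t x^{-1}, we see that Φ_*((H, 0)) = Ad_x(H).
Meanwhile, if X ∈ f, we compute that


    \frac{d}{d\tau} \Phi(t, [x e^{\tau X}]) \Big|_{\tau=0}
        = \frac{d}{d\tau}\, x e^{\tau X} t e^{-\tau X} x^{-1} \Big|_{\tau=0}
        = x X t x^{-1} - x t X x^{-1}
        = x t x^{-1} (x t^{-1} X t x^{-1} - x X x^{-1})
        = (x t x^{-1}) [\mathrm{Ad}_x(\mathrm{Ad}_{t^{-1}}(X) - X)],


so that Φ_*((0, X)) = Ad_x(Ad_{t^{-1}}(X) − X). These two calculations, together with
the linearity of the differential, establish the claimed form of Φ_*. □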


We now wish to determine when Φ_*(t, [x]) is invertible. Since Ad_x is invertible,
the question becomes whether Ad’_{t^{-1}} − I is invertible. When Φ_*(t, [x]) is invertible,
we would like to know whether this linear map is orientation preserving or
orientation reversing. In light of the way our orientations on T × (K/T) and K
have been chosen, the orientation behavior of Φ_* will be determined by the sign of
the determinant of Φ_* as a linear map of k to itself. Now, since K is connected and
our inner product on k is Ad_K-invariant, we see that Ad_x ∈ SO(k) for every x. Thus,
det(Ad_x) = 1, which means that we only need to calculate the determinant of the
second factor on the right-hand side of (11.11).


Lemma 11.28. For t ∈ T, let Ad’_{t^{-1}} denote the restriction of Ad_{t^{-1}} to f.


1. If t generates a dense subgroup of T, then Ad’_{t^{-1}} − I is an invertible linear
transformation of f.
2. For all w ∈ W and t ∈ T, we have

    det(Ad’_{w·t^{-1}} − I) = det(Ad’_{t^{-1}} − I).


Proof. The operator Ad’_{t^{-1}} − I is invertible provided that the restriction of Ad_{t^{-1}}
to f does not have an eigenvalue of 1. Suppose, then, that Ad_{t^{-1}}(X) = X for some
X ∈ f. Then for every integer m, we will have Ad_{t^m}(X) = X. If t generates a dense
subgroup of T, then by taking limits, we conclude that Ad_s(X) = X for all s ∈ T.
But then for all H ∈ t, we have


    [H, X] = \frac{d}{d\tau} \mathrm{Ad}_{e^{\tau H}}(X) \Big|_{\tau=0} = 0.


Since t is maximal commutative (Proposition 11.7), we conclude that X ∈ f ∩ t =
{0}. Thus, there is no nonzero X ∈ f for which Ad_{t^{-1}}(X) = X.
For the second point, if w ∈ W is represented by x ∈ N(T), we have


    Ad_{w·t^{-1}} − I = Ad_{x t^{-1} x^{-1}} − I
                      = Ad_x (Ad_{t^{-1}} − I) Ad_{x^{-1}}.

Thus, Ad’_{w·t^{-1}} − I and Ad’_{t^{-1}} − I are similar and have the same determinant. □


We are now ready for the proof of Lemma 11.12, which will complete the proof 
of the torus theorem. 


Proof of Lemma 11.12. By Proposition 11.4, we can choose t ∈ T so that the
subgroup generated by t is dense in T. Then by Lemma 11.26, the preimages
of t are in one-to-one correspondence with elements of W. Furthermore, by
Proposition 11.27 and Point 1 of Lemma 11.28, Φ_* is nondegenerate at each
preimage of t. Finally, by Point 2 of Lemma 11.28, Φ_* has the same orientation
behavior at each point of Φ^{-1}({t}). Thus, t is a regular value of Φ and the signed
number of preimages of t under Φ is either |W| or −|W|. It then follows from
Corollary 11.17 that Φ is surjective, which is the content of Lemma 11.12. □


Corollary 11.29. The Weyl group W is finite and the orientations on T × (K/T)
and K can be chosen so that the mapping degree of Φ is |W|, the order of the Weyl
group.


Proof. If t generates a dense subgroup of T, then by Lemma 11.26, Φ^{-1}({t}) is in
one-to-one correspondence with W. Furthermore, Point 1 of Lemma 11.28 then tells
us that such a t is a regular value of Φ. Thus, by Proposition 11.14, Φ^{-1}({t}) is finite,
and W is thus also finite. Meanwhile, we already noted in the proof of Lemma 11.12
that Φ has mapping degree equal to ±|W|. By reversing the orientation on K as
necessary, we can ensure that the mapping degree is |W|. □


11.6 The Weyl Integral Formula


In this section, we apply Proposition 11.19 to the map Φ to obtain an integration
formula for functions f on K. Of particular importance will be the special case in
which f satisfies f(y x y^{-1}) = f(x) for all x, y ∈ K. This special case of the Weyl
integral formula will be a main ingredient in our analytic proof of the Weyl character
formula in Sect. 12.4.

Recall that we have decomposed k as t ⊕ f, where f is the orthogonal complement
of t, and that the adjoint action of T on k preserves both t and f. Define a function
p : T → R by


    p(t) = det(Ad’_{t^{-1}} − I),    (11.12)


where Ad’_{t^{-1}} is the restriction of Ad_{t^{-1}} to f. Using Proposition 11.24, we can
construct volume forms on K, T, and K/T that are invariant under the left action
of K, T, and K, respectively. Since each of these manifolds is compact, the total
volume is finite, and we can normalize this volume to equal 1.


Theorem 11.30 (Weyl Integral Formula). For all continuous functions f on K,
we have


    \int_K f(x)\,dx = \frac{1}{|W|} \int_T p(t) \int_{K/T} f(y t y^{-1})\, d[y]\, dt,    (11.13)


where dx, dt, and d[y] are the normalized, left-invariant volume forms on K, T, and
K/T, respectively, and |W| is the order of the Weyl group.


In Sect. 12.4, we will compute p explicitly and relate it to the Weyl denominator. 


Proof of Theorem 11.30, up to a constant. Since Φ has mapping degree |W|,
Theorem 11.16 tells us that


    |W| \int_K f(x)\,dx = \int_{T \times (K/T)} \Phi^*(f(x)\,dx)
                        = \int_{T \times (K/T)} (f \circ \Phi)\, \Phi^*(dx),    (11.14)


for any smooth function f. Since, by the Stone–Weierstrass theorem (Theorem
7.33 in [Rud1]), every continuous function on K can be uniformly approximated
by smooth functions, (11.14) continues to hold when f is continuous. Thus, to
establish (11.13), it suffices to show that Φ^*(dx) = p(t) d[y] ∧ dt.

Pick orthonormal bases H_1, …, H_r for t and X_1, …, X_N for f. Then by the proof
of Proposition 11.24, we can find invariant volume forms α_1 on T, α_2 on K/T, and
β on K such that at each point, we have


    α_1(H_1, …, H_r) = α_2(X_1, …, X_N) = β(H_1, …, H_r, X_1, …, X_N) = 1,

so that

    (α_1 ∧ α_2)(H_1, …, H_r, X_1, …, X_N) = 1.


By the uniqueness in Proposition 11.24, α_1, α_2, and β will coincide, up to
multiplication by a constant, with the normalized volume forms dt, d[y], and dx,
respectively.

Now, at each point, the matrix of Φ_*, with respect to the chosen bases for T(T ×
(K/T)) and for T(K), is given by the matrix in (11.11). Thus, using the definition
of the pulled-back form Φ^*(β), we have


    Φ^*(β)(H_1, …, H_r, X_1, …, X_N)
        = β(Φ_*(H_1), …, Φ_*(H_r), Φ_*(X_1), …, Φ_*(X_N))
        = det(Φ_*) β(H_1, …, H_r, X_1, …, X_N)
        = p(t) (α_1 ∧ α_2)(H_1, …, H_r, X_1, …, X_N),


where in the third line, we use (B.1) in Appendix B. Since dx coincides with β up
to a constant and d[y] ∧ dt coincides with α_1 ∧ α_2 up to a constant, (11.14) then
becomes


    |W| \int_K f(x)\,dx = C \int_{T \times (K/T)} p(t)\, f(y t y^{-1})\, d[y] \wedge dt.


It remains only to show that C = 1. We postpone the proof of this claim until
Sect. 12.4. □


It is possible to verify that C = 1 directly, using Lemma 11.21. According to
that result, if U is a small open set in K/T, then q^{-1}(U) is diffeomorphic to U × T
(where q : K → K/T denotes the quotient map).
It is not hard to check that under the diffeomorphism between q^{-1}(U) and U × T,
the volume form β decomposes as the product of α_1 and α_2. Thus, for any (nice
enough) E ⊂ U, the volume of E × T ≅ q^{-1}(E) is the product of the volume of E
and the volume of T. From this, it is not hard to show that for any (nice enough) set
E ⊂ K/T, the volume of q^{-1}(E) equals the volume of E (with respect to α_2) times
the volume of T (with respect to α_1). In particular, the volume of q^{-1}(K/T) = K
is the product of the volume of K/T and the volume of T. Thus, if we choose our
inner products on t and on f so that the volume forms α_1 and α_2 are normalized, the
volume form β will also be normalized. In that case, the above computation holds on
the nose, without any undetermined constant. Since we will offer a different proof
of the normalization constant in Sect. 12.4, we omit the details of this argument.

We now consider an important special case of Theorem 11.30. 


Definition 11.31. A function f : K → C is called a class function if f(y x y^{-1}) =
f(x) for all x, y ∈ K.

That is to say, a function is a class function if it is constant on each conjugacy
class.


Corollary 11.32. If f is a continuous class function on K, then

    \int_K f(x)\,dx = \frac{1}{|W|} \int_T p(t) f(t)\, dt.    (11.15)


Proof. If f is a class function, then f(y t y^{-1}) = f(t) for all y ∈ K and t ∈ T.
Since the volume form on K/T is normalized, we have


    \int_{K/T} f(y t y^{-1})\, d[y] = f(t),


in which case, the Weyl integral formula reduces to (11.15). □


Example 11.33. Suppose K = SU(2) and T is the diagonal subgroup. Then
Corollary 11.32 takes the form


    \int_{SU(2)} f(x)\,dx = \frac{1}{2} \int_{-\pi}^{\pi} f(\mathrm{diag}(e^{i\theta}, e^{-i\theta}))\, 4 \sin^2(\theta)\, \frac{d\theta}{2\pi},    (11.16)


where |W| = 2 and the normalized volume measure on T is dθ/(2π).


Note that if f = 1, both sides of (11.16) integrate to 1. See also Exercise 9 in
Chapter 12 for an explicit version of the Weyl integral formula for U(n).


Proof. If we use the Hilbert–Schmidt inner product on su(2), the orthogonal
complement of t in su(2) is the space of matrices X of the form


    X = \begin{pmatrix} 0 & x + iy \\ -x + iy & 0 \end{pmatrix}


with x, y ∈ R. Direct computation then shows that if t = diag(e^{iθ}, e^{-iθ}), then


    \mathrm{Ad}_{t^{-1}}(X) = \begin{pmatrix} 0 & e^{-2i\theta}(x + iy) \\ e^{2i\theta}(-x + iy) & 0 \end{pmatrix}.


Thus, Ad_{t^{-1}} acts as a rotation by angle −2θ in C ≅ R^2. It follows that


    \det(\mathrm{Ad}'_{t^{-1}} - I) = \det \begin{pmatrix} \cos(2\theta) - 1 & \sin(2\theta) \\ -\sin(2\theta) & \cos(2\theta) - 1 \end{pmatrix}.


This determinant simplifies by elementary trigonometric identities to 4 sin^2 θ.
Finally, since W = {I, −I}, we have |W| = 2. □
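
The determinant computation in the preceding proof can be spot-checked numerically. The Python sketch below computes the matrix of Ad_{t^{-1}} on f directly from t^{-1} X t, using the basis of f given by (x, y) = (1, 0) and (0, 1), and compares det(Ad’_{t^{-1}} − I) with 4 sin^2 θ. The function names and sample angles are assumptions made here for illustration.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def torus_element(theta):
    return [[cmath.exp(1j * theta), 0j], [0j, cmath.exp(-1j * theta)]]

# Basis of f: matrices [[0, x + iy], [-x + iy, 0]] with (x, y) = (1, 0), (0, 1).
E1 = [[0j, 1 + 0j], [-1 + 0j, 0j]]
E2 = [[0j, 1j], [1j, 0j]]

def coordinates(m):
    # Recover (x, y) from the upper-right entry x + iy.
    return (m[0][1].real, m[0][1].imag)

def density(theta):
    """det(Ad'_{t^{-1}} - I) for t = diag(e^{i theta}, e^{-i theta})."""
    t, t_inv = torus_element(theta), torus_element(-theta)
    c1 = coordinates(matmul(matmul(t_inv, E1), t))  # first column of Ad'
    c2 = coordinates(matmul(matmul(t_inv, E2), t))  # second column of Ad'
    return (c1[0] - 1.0) * (c2[1] - 1.0) - c2[0] * c1[1]

max_error = max(abs(density(th) - 4.0 * math.sin(th) ** 2)
                for th in (0.1, 1.0, 2.5))

# Normalization check for (11.16) with f = 1, using a midpoint rule.
n = 720
h = 2.0 * math.pi / n
normalization = 0.5 * sum(density(-math.pi + (j + 0.5) * h)
                          for j in range(n)) * h / (2.0 * math.pi)
print(max_error, normalization)
```

The second printed value should be close to 1, consistent with the remark that both sides of (11.16) integrate to 1 when f = 1.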

Note that the matrix diag(e^{iθ}, e^{-iθ}) is conjugate in SU(2) to the matrix
diag(e^{-iθ}, e^{iθ}):


    \begin{pmatrix} e^{-i\theta} & 0 \\ 0 & e^{i\theta} \end{pmatrix}
        = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
          \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}
          \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.


Thus, if f is a class function on SU(2), the value of f at diag(e^{-iθ}, e^{iθ}) is the
same as its value at diag(e^{iθ}, e^{-iθ}). We may therefore rewrite the right-hand side
of (11.16) as


    \int_0^{\pi} f(\mathrm{diag}(e^{i\theta}, e^{-i\theta}))\, 4 \sin^2(\theta)\, \frac{d\theta}{2\pi}.    (11.17)


Meanwhile, recall from Exercise 5 in Chapter 1 that SU(2) can be identified
with the unit sphere S^3 ⊂ C^2. By Exercise 9, a function f on SU(2) is a class
function if and only if the associated function on S^3 depends only on the polar
angle. Furthermore, for 0 ≤ θ ≤ π, the polar angle associated to diag(e^{iθ}, e^{-iθ})
is simply θ. With this perspective, (11.17) is simply the formula for integration in
spherical coordinates on S^3, in the special case in which the function depends only
on the polar angle. (Apply the m = 4 case of Eq. (21.15) in [Has] to a function that
depends only on the polar angle.)
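
Formula (11.17) is easy to evaluate numerically for specific class functions. The Python sketch below applies a midpoint rule to the constant function 1, to the trace, and to the square of the trace (on the torus, the trace is 2 cos θ). The choice of these test functions and the function names are assumptions made here; the expected values 1, 0, and 1 reflect the standard orthogonality relations for characters of the trivial and defining representations of SU(2).

```python
import math

def weyl_integral_su2(class_function_on_torus, n=2000):
    """Midpoint-rule approximation to the expression (11.17).

    class_function_on_torus(theta) should return the value of the class
    function at diag(e^{i theta}, e^{-i theta}).
    """
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        total += class_function_on_torus(theta) * 4.0 * math.sin(theta) ** 2
    return total * h / (2.0 * math.pi)

def trace(theta):
    # Trace of diag(e^{i theta}, e^{-i theta}).
    return 2.0 * math.cos(theta)

total_volume = weyl_integral_su2(lambda theta: 1.0)
mean_trace = weyl_integral_su2(trace)
mean_trace_squared = weyl_integral_su2(lambda theta: trace(theta) ** 2)
print(total_volume, mean_trace, mean_trace_squared)
```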


11.7 Roots and the Structure of the Weyl Group 


In the context of compact Lie groups, it is convenient and customary to redefine the 
notion of “root” by a factor of i so that the roots will now live in t rather than in i·t.


Definition 11.34. An element α of t is a real root of g with respect to t if α ≠ 0 and
there exists a nonzero element X of g such that


    [H, X] = i ⟨α, H⟩ X


for all H ∈ t. For each real root α, we consider also the associated real coroot H_α
given by


    H_α = 2α / ⟨α, α⟩.


When working with the group K and its Lie algebra k, the use of real roots (and,
later, real weights for representations) is convenient because it makes the factors of
i explicit, rather than hiding them in the fact that the roots live in i·t. If, for example,
we wish to compute the complex conjugate of the expression e^{i⟨α,H⟩}, where α is a
real root and H is in t, the explicit factor of i makes it obvious that the conjugate is
e^{−i⟨α,H⟩}.

If our compact group K is simply connected, then g := k_C is semisimple, by
Proposition 7.7. In general, k decomposes as k = k_1 ⊕ z, where z is the center of k and
where g_1 := (k_1)_C is semisimple. (See the proof of Proposition 7.6.) Furthermore,
as in the proof of Theorem 7.35, every maximal commutative subalgebra t of k will
be of the form t_1 ⊕ z, where t_1 is a maximal commutative subalgebra of k_1. All
the results from Chapter 7 then apply (with slight modifications to account for the
use of real roots) except that the roots may not span t. Nevertheless, the roots form
a root system in their span, namely the space t_1. Throughout the section, we will
let R ⊂ t denote the set of real roots, Δ a fixed base for R, and R^+ the
associated set of positive (real) roots.

Now that we have introduced the (real) roots for K, it makes sense to compare 
the Weyl group in the compact group sense (the group N(T)/T) to the Weyl group 
in the Lie algebra sense (the group generated by reflections about the hyperplanes 
perpendicular to the roots). As it turns out, these two groups are isomorphic. It is 
not hard to show that for each reflection there is an associated element of the Weyl 
group. The harder part of the proof is to show that these elements generate all of 
N(T)/T. This last claim is proved by making a clever use of the torus theorem. 


Proposition 11.35. For each α ∈ R, there is an element x in N(T) such that

    Ad_x(H_α) = −H_α


and such that


    Ad_x(H) = H


for all H ∈ t for which ⟨α, H⟩ = 0. Thus, the adjoint action of x on t is the
reflection s_α about the hyperplane orthogonal to α.


Proof. Choose X_α and Y_α as in Theorem 7.19, with Y_α = X_α^*. Then (X_α − Y_α)^* =
−(X_α − Y_α), from which it follows that X_α − Y_α ∈ k. Let us define x ∈ K by


    x = \exp\left[ \frac{\pi}{2} (X_\alpha - Y_\alpha) \right]


(where π is the number 3.14···, not a representation). Then by the relationship
between Ad and ad (Proposition 3.35), we have


    \mathrm{Ad}_x(H) = \exp\left[ \frac{\pi}{2} (\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha}) \right](H)    (11.18)


for all H ∈ t. If ⟨α, H⟩ = 0, then (ad_{X_α} − ad_{Y_α})(H) = 0, so that Ad_x(H) = H.

Consider, then, the case H = H_α. In that case, the entire calculation on the
right-hand side of (11.18) takes place in the subalgebra s^α = ⟨X_α, Y_α, H_α⟩ of g. In
s^α, the elements X_α − Y_α, iX_α + iY_α + H_α, and iX_α + iY_α − H_α are eigenvectors
for ad_{X_α} − ad_{Y_α} with eigenvalues 0, 2i, and −2i, respectively. Since H_α is half the
difference of the last two vectors, we have


    \exp\left[ \frac{\pi}{2} (\mathrm{ad}_{X_\alpha} - \mathrm{ad}_{Y_\alpha}) \right](H_\alpha)
        = e^{i\pi} (iX_\alpha + iY_\alpha + H_\alpha)/2 - e^{-i\pi} (iX_\alpha + iY_\alpha - H_\alpha)/2
        = -H_\alpha.    (11.19)


Thus, Ad_x maps H_α to −H_α and is the identity on the orthogonal complement of α.
□
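
The identity (11.19) can be checked numerically in the smallest case. The Python sketch below uses the standard basis X, Y, H of sl(2; C) with Y = X^* and H = [X, Y]; this choice of matrices is an assumption made here to stand in for X_α, Y_α, H_α, and the truncated power series for the matrix exponential is an illustrative helper, not a method taken from the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(a, terms=40):
    """Matrix exponential by a truncated power series (fine for small norms)."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

# Assumed stand-ins for X_alpha, Y_alpha, H_alpha: the standard sl(2; C) triple.
X = [[0j, 1 + 0j], [0j, 0j]]
Y = [[0j, 0j], [1 + 0j, 0j]]        # Y = X^*
H = [[1 + 0j, 0j], [0j, -1 + 0j]]   # H = [X, Y]

generator = [[(math.pi / 2.0) * (X[i][j] - Y[i][j]) for j in range(2)]
             for i in range(2)]
x = mat_exp(generator)
x_inverse = mat_exp([[-entry for entry in row] for row in generator])

# Ad_x(H) = x H x^{-1}; by linearity the same sign change applies to iH in t.
ad_x_of_H = matmul(matmul(x, H), x_inverse)
error = max(abs(ad_x_of_H[i][j] + H[i][j]) for i in range(2) for j in range(2))
print(error)
```

A value of error near zero indicates that Ad_x(H) = −H in this example, in agreement with (11.19).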


See Exercise 10 for an alternative approach to verifying (11.19). We now proceed
to show that the Weyl group is generated by the reflections s_α, α ∈ R. We let Z(T)
denote the centralizer of T, that is,


    Z(T) = {x ∈ K | xt = tx for all t ∈ T}.


Theorem 11.36. If T is a maximal torus in K, the following results hold. 


1. Z(T) =T. 
2. The Weyl group acts effectively on t and this action is generated by the reflections
s_α, α ∈ R, in Proposition 11.35.


Since Z(T) = T, it follows that T is a maximal commutative subgroup of K
(i.e., there is no commutative subgroup of K properly containing T). Nevertheless, 
there may exist maximal commutative subgroups of K that are not maximal tori; 
see Exercise 5. 

It is not hard to verify Theorem 11.36 directly in the case of SU(n); see 
Exercise 3. The following lemma is the key technical result in the proof of 
Theorem 11.36 in general. 


Lemma 11.37. Suppose S is a connected, commutative subgroup of K. If x belongs 
to Z(S), then there is a maximal torus S’ containing both S and x. 


Proof. Let x be in Z(S), let B be the subgroup of K generated by S and x, and let
B̄ be the closure of B. We are going to show that there is an element y of B̄ such
that the group generated by y is dense in B̄. The torus theorem will then tell us that
there is a maximal torus S’ containing y and, thus, both S and x.

Since B̄ is compact and commutative, Theorem 11.2 implies that the identity
component B̄_0 of B̄ is a torus. Since B̄ is compact, it has only finitely many
components (Exercise 15 in Chapter 3), which means that the quotient group B̄/B̄_0
is finite.

Now, every element y of B̄ is the limit of a sequence of the form x^{n_k} s_k for some
integers n_k and elements s_k ∈ S. Thus, for some large k, the element x^{n_k} s_k will
be in the same component of B̄ as y. But since S is connected, x^{n_k} must also be
in the same component of B̄ as y. It follows that [y] and [x^{n_k}] represent the same
element of the quotient group B̄/B̄_0. We conclude, then, that B̄/B̄_0 is a cyclic group
generated by [x]. Since also B̄/B̄_0 is finite, it must be isomorphic to Z/m for some
positive integer m.

It follows that x^m belongs to the torus B̄_0. Choose some t ∈ B̄_0 such that the
subgroup generated by t is dense in B̄_0 (Proposition 11.4), and choose g ∈ B̄_0 so that
g^m = x^{-m} t. (Since the exponential map for the torus B̄_0 is surjective, x^{-m} t ∈ B̄_0
has an mth root in B̄_0.) Now set y = gx, so that y is in the same component of B̄
as x. Since B̄ is commutative, we have


    y^m = g^m x^m = t,


which means that the set of elements of the form y^{nm} = t^n is dense in B̄_0. Now,
since B̄/B̄_0 is cyclic with generator [x], each component of B̄ is of the form x^k B̄_0
for some k. Furthermore, the set of elements of the form y^{nm+k} = x^k g^k t^n is dense in
x^k B̄_0.

We see, then, that the group generated by y contains a dense subset of each
component of B̄ and is, thus, dense in B̄. By the torus theorem, there is a maximal
torus S’ that contains y. It follows that S’ must contain B̄ and, in particular, both S
and x. □


Figure 11.6 illustrates the proof of Lemma 11.37 in the case where $\bar{B}/\bar{B}_0$ is
cyclic of order 5. We choose y in the same component of $\bar{B}$ as x in such a way that
$y^5$ generates a dense subgroup of $\bar{B}_0$. Then y generates a dense subgroup of
the whole group $\bar{B}$.

Fig. 11.6 The element y generates a dense subgroup of $\bar{B}$
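The construction of y in the proof of Lemma 11.37 can be tried out numerically in a toy case. The sketch below assumes a hypothetical model group (not taken from the text): the closed group is $\mathbb{Z}/5 \times S^1$ inside $S^1 \times S^1$, with S = {1} × S^1, x = (ζ, 1) for a primitive fifth root of unity ζ, and t = (1, e^{ia}) with a/(2π) irrational. Then g = (1, e^{ia/5}) satisfies g^5 = x^{-5} t, and y = gx should have powers that come arbitrarily close to any point of any component.

```python
import cmath
import math

m = 5                                   # order of the component group in this toy model
zeta = cmath.exp(2j * math.pi / m)      # x = (zeta, 1)
a = 2 * math.pi * math.sqrt(2)          # t = (1, e^{ia}); a / (2 pi) = sqrt(2) is irrational

def y_power(n):
    """n-th power of y = g x = (zeta, e^{i a / m}), computed directly."""
    return (cmath.exp(2j * math.pi * n / m), cmath.exp(1j * n * a / m))

def closest_power(target_k, target_phi, eps=0.05, max_n=10000):
    """Return some n with y^n within eps of (zeta^k, e^{i phi}), or None if none is found."""
    target = (zeta ** target_k, cmath.exp(1j * target_phi))
    for n in range(max_n):
        p = y_power(n)
        # First coordinate must land in the right component; second must be eps-close.
        if abs(p[0] - target[0]) < 1e-9 and abs(p[1] - target[1]) < eps:
            return n
    return None
```

For instance, `closest_power(3, 1.7)` searches for a power of y lying in the component labeled by ζ³ whose circle coordinate is within 0.05 of e^{1.7i}; the tolerance and search bound are arbitrary illustrative values.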

Proof of Point 1 of Theorem 11.36. If we apply Lemma 11.37 with S = T, we see
that any element x of Z(T) must be contained in a maximal torus S’ that contains
T. But since T is also maximal, we must have S’ = T, so that $x \in T$. □


Lemma 11.38. Suppose C is the fundamental Weyl chamber with respect to $\Delta$ and
that w is an element of W that maps C to C. Then w = 1. 


For any $w \in W$, the action of w constitutes a symmetry of the root system R,
that is, an orthogonal linear transformation that maps R to itself. In some cases,
such as the root system $B_2$, there is no nontrivial symmetry of R that maps C to C.
In the case of $A_2$, on the other hand, there is a nontrivial symmetry of R that maps
C to C, namely the unique linear transformation that interchanges $\alpha_1$ and $\alpha_2$. (This
map is just the reflection about the line through the root $\alpha_3 = \alpha_1 + \alpha_2$.) The lemma
asserts that although this map is a symmetry of R, it is not given by the action of
a Weyl group element. In the $A_2$ case, of course, we have an explicit description of
the Weyl group, and we can easily check that there is no $w \in W$ with $w \cdot \alpha_1 = \alpha_2$
and $w \cdot \alpha_2 = \alpha_1$. (See Figure 11.7, where the Weyl group is the symmetry group of
the indicated triangle.) Nevertheless, we need an argument that works in the general
case.
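The explicit check in the $A_2$ case can be sketched in a few lines of Python. The code writes the two simple reflections as integer matrices in the basis $(\alpha_1, \alpha_2)$, using $s_1(\alpha_2) = s_2(\alpha_1) = \alpha_1 + \alpha_2$, and generates the group they span. The names `weyl_group`, `swap`, and `chamber_preserving` are illustrative, and the last step relies on the standard fact that a Weyl group element preserves C exactly when it permutes the base.

```python
def mul(a, b):
    """Product of 2x2 integer matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Matrices in the basis (alpha_1, alpha_2); columns are the images of the basis vectors.
s1 = ((-1, 1), (0, 1))    # alpha_1 -> -alpha_1,          alpha_2 -> alpha_1 + alpha_2
s2 = ((1, 0), (1, -1))    # alpha_1 -> alpha_1 + alpha_2, alpha_2 -> -alpha_2
identity = ((1, 0), (0, 1))

# Close {identity} under left multiplication by the generators.
weyl_group = {identity}
frontier = [identity]
while frontier:
    w = frontier.pop()
    for s in (s1, s2):
        new = mul(s, w)
        if new not in weyl_group:
            weyl_group.add(new)
            frontier.append(new)

swap = ((0, 1), (1, 0))   # the symmetry of R interchanging alpha_1 and alpha_2

def columns(w):
    """Images of alpha_1 and alpha_2 under w, as a set of coordinate pairs."""
    return {(w[0][0], w[1][0]), (w[0][1], w[1][1])}

# Elements of W that permute the base {alpha_1, alpha_2}.
chamber_preserving = [w for w in weyl_group if columns(w) == {(1, 0), (0, 1)}]
```

The group has six elements, `swap` is not among them, and only the identity permutes the base, in line with Lemma 11.38.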

The idea of the proof is as follows. We want to show that if $x \in N(T)$ and
$\mathrm{Ad}_x(C) \subset C$, then $x \in T$. The idea is to show that x must commute with some
nonzero $H \in \mathfrak{t}$, and that this H can be chosen to be “nice.” If H could be chosen
so that the group exp(tH) were dense in T, then x would have to commute with
every element of T, so that x would belong to Z(T) = T. Although we cannot, in
general, choose H to be as nice as that, we can choose H to be in the interior of C,
which means that $\langle \alpha, H \rangle \neq 0$ for all $\alpha \in R$. This turns out to be sufficient to show
that $x \in T$.

Fig. 11.7 The reflection about the line through $\alpha_3$ is a symmetry of R that maps C to itself, but
that is not an element of W


Proof. Let $x \in K$ be a representative of the element $w \in W$. Take any $H_0$ in the
interior of C and average $H_0$ over the action of the (finite) subgroup of W generated
by w. The resulting vector H is still in the interior of C (which is convex) and is
now fixed by w, meaning that $\mathrm{Ad}_x(H) = H$. Thus, x commutes with every element
of the one-parameter subgroup $S := \{\exp(tH) \mid t \in \mathbb{R}\}$. By Lemma 11.37, there is a
maximal torus S’ containing x and S. Suppose, toward a contradiction, that x is not
in T. Then S’ cannot equal T, and, since S’ is maximal, S’ cannot be contained in
T. Thus, there is some $X \in \mathfrak{k}$ that is in the Lie algebra $\mathfrak{s}'$ of S’ but not in $\mathfrak{t}$, and this
X commutes with exp(tH), $t \in \mathbb{R}$, and hence with H.

On the other hand, suppose we decompose $X \in \mathfrak{k} \subset \mathfrak{g}$ as a sum of an element
of $\mathfrak{h}$ and elements of the various root spaces $\mathfrak{g}_\alpha$. Now, [H, X] = 0 and $\langle \alpha, H \rangle$ is
nonzero for all $\alpha$, since H is in the interior of C. Thus, the component of X in
each $\mathfrak{g}_\alpha$ must be zero, meaning that $X \in \mathfrak{h} \cap \mathfrak{k} = \mathfrak{t}$, which is a contradiction. Thus,
actually, x must be in T, which means that w is the identity element of W. □


Proof of Point 2 of Theorem 11.36. We let $W' \subset W$ be the group generated by
reflections. By Proposition 8.23, W’ acts transitively on the set of Weyl chambers.
Thus, for any $w \in W$, we can find $w' \in W'$ mapping w(C) back to C, so that $w'w$
maps C to C. Then by Lemma 11.38, we have that $w'w = 1$, which means that
$w = (w')^{-1}$ belongs to W’. Thus, every element of W actually belongs to W’. □


Theorem 11.39. If two elements s and t of T are conjugate in K, then there exists 
an element w of W such that $w \cdot s = t$.


We may restate the theorem equivalently as follows: If $t = xsx^{-1}$ for some $x \in K$,
then t can also be expressed as $t = ysy^{-1}$ for some $y \in N(T)$.


Proof. Suppose s and t are in T and $t = xsx^{-1}$ for some $x \in K$. Let Z(t) be
the centralizer of t. Since $xux^{-1}$ commutes with $t = xsx^{-1}$ for all $u \in T$, we see
that $xTx^{-1} \subset Z(t)$. Thus, both T and $xTx^{-1}$ are tori in Z(t). Actually, since T and
$xTx^{-1}$ are connected, they must be contained in the identity component $Z(t)_0$ of
Z(t). Furthermore, since T and $xTx^{-1}$ are maximal tori in K, they must be maximal
tori in $Z(t)_0$. Thus, we may apply the torus theorem to $Z(t)_0$ to conclude that there
is some $z \in Z(t)_0$ such that

\[ z x T x^{-1} z^{-1} = T. \tag{11.20} \]

Now, (11.20) says that zx is in the normalizer N(T) of T. Furthermore, since
$z \in Z(t)_0$ commutes with t, we have

\[ (zx) s (zx)^{-1} = z x s x^{-1} z^{-1} = z t z^{-1} = t. \]

Thus, y := zx is the desired element of N(T) such that $t = ysy^{-1}$. □

Corollary 11.40. If f is a continuous Weyl-invariant function on T, then f extends
uniquely to a continuous class function on K. 


Proof. By the torus theorem, each conjugacy class in K intersects T in at least one 
point. By Theorem 11.39, each conjugacy class intersects T in a single orbit of W. 
Thus, if f is a W-invariant function on T, we can unambiguously (and uniquely) 
extend f to a class function F on K by making F constant on each conjugacy class. 

It remains to show that the extended function F is continuous on K. Suppose,
then, that $(x_n)$ is a sequence in K converging to some x. We can write each $x_n$ as
$x_n = y_n t_n y_n^{-1}$, with $y_n \in K$ and $t_n \in T$. Since both K and T are compact, we
can, after passing to a subsequence, assume that $y_n$ converges to some $y \in K$
and $t_n$ converges to some $t \in T$. It follows that

\[ x = \lim_{n \to \infty} x_n = \lim_{n \to \infty} y_n t_n y_n^{-1} = y t y^{-1}. \]

Now, by our construction of F, we have $F(x_n) = f(t_n)$ and F(x) = f(t). Thus,
since f is assumed to be continuous on T, we see that

\[ F(x_n) = f(t_n) \to f(t) = F(x), \]

showing that F is continuous on K. □


11.8 Exercises 


1. Let $\Gamma$ denote the set of all vectors in $\mathbb{R}^2$ that can be expressed in the form

\[ a(1, 1) + b(3, 1) + c(2, -4), \]

for $a, b, c \in \mathbb{Z}$. Then $\Gamma$ is a subgroup of $\mathbb{R}^2$ and $\Gamma$ is discrete, since it is
contained in $\mathbb{Z}^2$. Find linearly independent vectors $v_1$ and $v_2$ in $\Gamma$ such that
$\Gamma$ consists precisely of the set of integer linear combinations of $v_1$ and $v_2$.
(Compare Lemma 11.3.)
2. (a) Show that every continuous homomorphism from $S^1$ to $S^1$ is of the form
$u \mapsto u^m$ for some $m \in \mathbb{Z}$.
Hint: Use Theorem 3.28.
(b) Show that every continuous homomorphism from $(S^1)^k$ to $S^1$ is of the
form

\[ (u_1, \ldots, u_k) \mapsto u_1^{m_1} \cdots u_k^{m_k} \]

for some integers $m_1, \ldots, m_k$.


3. Consider the group SU(n), with maximal torus T being the intersection of
SU(n) with the space of diagonal matrices. Prove directly (without appealing
to Theorem 11.36) that Z(T) = T and that N(T)/T is isomorphic to the
Weyl group of the root system $A_{n-1}$. (Compare Sect. 7.7.1.)
Hint: Imitate the calculations in Sect. 6.6.


4. Give an example of closed, oriented manifolds M and N of the same dimension
and a smooth map $f : M \to N$ such that f has mapping degree zero but f is,
nevertheless, surjective.


5. Let K = SO(n), where $n \geq 3$. Let H be the (commutative) subgroup of K
consisting of the diagonal matrices in SO(n). (Of course, the diagonal entries
have to be $\pm 1$ and the number of diagonal entries equal to $-1$ must be even.)
Show that H is a maximal commutative subgroup of K and that H is not
contained in a maximal torus.

Hint: Use Proposition A.2.

Note: This example shows that in Lemma 11.37, the assumption that S be
connected cannot be omitted. (Otherwise, we could take S = H and x = I
and we would conclude that there is a maximal torus S’ containing H.)


6. Suppose K = SU(n) and H is any commutative subgroup of K. Show that H
is conjugate to a subgroup of the diagonal subgroup of K and thus that H is
contained in a maximal torus. This result should be contrasted with the result
of Exercise 5.


7. (a) For any interval $(a, b) \subset \mathbb{R}$ and any $x, y \in (a, b)$, show that there exists a
smooth family of diffeomorphisms $f_t : (a, b) \to (a, b)$, $0 \leq t \leq 1$, with
the following properties. First, $f_0(z) = z$ for all z. Second, there is some
$\varepsilon > 0$ such that $f_t(z) = z$ for all $z \in (a, a + \varepsilon)$ and for all $z \in (b - \varepsilon, b)$.
Third, $f_1(x) = y$.

Hint: Take

\[ f_t(z) = z + t \int_a^z g(u)\,du \]

for some carefully chosen function g.

(b) If $R \subset \mathbb{R}^n$ is an open rectangle and x and y belong to R, show that there
is a smooth family of diffeomorphisms $\Psi_t : R \to R$ such that (1) $\Psi_0$ is the
identity map, (2) each $\Psi_t$ is the identity in a neighborhood of $\partial R$, and (3)
$\Psi_1(x) = y$.


8. (a) Show that the left action of G on G/H is smooth with respect to
the collection of coordinate patches on G/H described in the proof of
Theorem 11.22.

(b) Show that the kernel of the differential of the quotient map $Q : G \to G/H$
at the identity is precisely $\mathfrak{h}$.


9. According to Exercise 5 in Chapter 1, each element of SU(2) can be written
uniquely as

\[ U = \begin{pmatrix} \alpha & -\bar{\beta} \\ \beta & \bar{\alpha} \end{pmatrix}, \]

where $(\alpha, \beta) \in \mathbb{C}^2$ belongs to the unit sphere $S^3$. For each $U \in SU(2)$, let $v_U$
denote the corresponding unit vector $(\alpha, \beta)$. Now, the angle $\theta$ between $(\alpha, \beta) \in S^3$
and the “north pole” (1, 0) satisfies

\[ \cos\theta = \mathrm{Re}(\langle (\alpha, \beta), (1, 0) \rangle) = \mathrm{Re}(\alpha), \]

and this relation uniquely determines $\theta$, if we take $0 \leq \theta \leq \pi$. In spherical
coordinates on $S^3$, the angle $\theta$ is the polar angle.

(a) Suppose $U \in SU(2)$ has eigenvalues $e^{i\theta}$ and $e^{-i\theta}$. Show that

\[ \mathrm{Re}(\langle v_U, (1, 0) \rangle) = \cos\theta. \]

Hint: Use the trace.
(b) Conclude that $U_1$ and $U_2$ are conjugate in SU(2) if and only if $v_{U_1}$ and $v_{U_2}$
have the same polar angle.

10. In this exercise, we give an alternative verification of the identity (11.19). Since
the left-hand side of (11.19) is expressed in purely Lie-algebraic terms, we may
do the calculation in any Lie algebra isomorphic to $\langle X_\alpha, Y_\alpha, H_\alpha \rangle$, for example,
in sl(2; C) itself. That is to say, it suffices to prove the formula with $X_\alpha = X$,
$Y_\alpha = Y$, and $H_\alpha = H$, where X, Y, and H are the usual basis elements for
sl(2; C).

Show that

\[ \exp\left[\frac{\pi}{2}(\mathrm{ad}_X - \mathrm{ad}_Y)\right](H) = e^{\frac{\pi}{2}(X - Y)} H e^{-\frac{\pi}{2}(X - Y)} = -H, \]

as claimed.


Chapter 12 
The Compact Group Approach 
to Representation Theory 


In this chapter, we follow Hermann Weyl’s original approach to establishing the 
Weyl character formula and the theorem of the highest weight. Throughout the 
chapter, we assume K is a connected, compact matrix Lie group, with Lie algebra
$\mathfrak{k}$. Throughout the chapter, we let T denote a fixed maximal torus in K and we let
$\mathfrak{t}$ denote the Lie algebra of T. We let R denote the set of real roots for $\mathfrak{k}$ relative
to $\mathfrak{t}$ (Definition 11.34), we let $\Delta$ be a fixed base for R, and we let $R^+$ denote the
positive real roots relative to $\Delta$. We also let W := N(T)/T denote the Weyl group
for K relative to T. In light of Theorem 11.36, W is isomorphic to the subgroup of
$O(\mathfrak{t})$ generated by the reflections about the hyperplanes orthogonal to the roots.


12.1 Representations 


All representations of K considered in this chapter are assumed to be finite 
dimensional and defined on a vector space over C, unless otherwise stated. Although 
we are studying in this chapter the representations of the compact group K, it is 
convenient to describe the weights of such a representation in terms of the associated 
representation of $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$. Since we are using real roots for $\mathfrak{g}$, we will also use real
weights for representations of K.


Definition 12.1. Let $(\Pi, V)$ be a finite-dimensional representation of K and $\pi$ the
associated representation of $\mathfrak{g}$. An element $\lambda$ of $\mathfrak{t}$ is called a real weight of V if
there exists a nonzero element v of V such that

\[ \pi(H) v = i \langle \lambda, H \rangle v \tag{12.1} \]

for all $H \in \mathfrak{t}$. The weight space with weight $\lambda$ is the set of all $v \in V$
satisfying (12.1) and the multiplicity of $\lambda$ is the dimension of the corresponding
weight space.

We now consider some elementary properties of the weights of a representation.


Proposition 12.2. If $(\Pi, V)$ is a finite-dimensional representation of K, the real
weights for $\Pi$ and their multiplicities are invariant under the action of the Weyl
group.


Proof. Following the proof of Theorem 6.22, we can show that if $w \in W$ is
represented by $x \in N(T)$, then $\Pi(x)$ will map the weight space with weight $\lambda$
isomorphically onto the weight space with weight $w \cdot \lambda$. □


The weights of a representation of K satisfy an integrality condition that does not,
in general, coincide with the notion of integrality in Definition 8.34. 


Definition 12.3. Let $\Gamma$ be the subset of $\mathfrak{t}$ given by

\[ \Gamma = \{H \in \mathfrak{t} \mid e^{2\pi H} = I\}. \]

We refer to $\Gamma$ as the kernel of the exponential map for $\mathfrak{t}$.


The set $\Gamma$ should, more precisely, be referred to as the kernel of the exponential
map scaled by a factor of $2\pi$. Note that if $w \in W$ is represented by $x \in N(T)$, then
for all $H \in \Gamma$, we have

\[ e^{2\pi\, w \cdot H} = e^{2\pi x H x^{-1}} = x e^{2\pi H} x^{-1} = I. \]

Thus, $\Gamma$ is invariant under the action of W on $\mathfrak{t}$.


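Definition 12.4. An element $\lambda$ of $\mathfrak{t}$ is an analytically integral element if

\[ \langle \lambda, H \rangle \in \mathbb{Z} \]

for all H in $\Gamma$. An element $\lambda$ of $\mathfrak{t}$ is an algebraically integral element if

\[ \langle \lambda, H_\alpha \rangle = 2 \frac{\langle \lambda, \alpha \rangle}{\langle \alpha, \alpha \rangle} \in \mathbb{Z} \]

for each real root $\alpha$. An element $\lambda$ of $\mathfrak{t}$ is dominant if

\[ \langle \lambda, \alpha \rangle \geq 0 \]

for all $\alpha \in \Delta$. Finally, if $\Delta = \{\alpha_1, \ldots, \alpha_r\}$ and $\mu$ and $\lambda$ are two elements of $\mathfrak{t}$, we
say that $\mu$ is higher than $\lambda$ if

\[ \mu - \lambda = c_1 \alpha_1 + \cdots + c_r \alpha_r \]

with $c_j \geq 0$. We denote this relation by $\mu \succeq \lambda$.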

The notion of an algebraically integral element is essentially the one we used in
Chapters 8 and 9 in the context of semisimple Lie algebras. Specifically, if $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$
is semisimple, then $\lambda \in \mathfrak{t}$ is algebraically integral if and only if $i\lambda$ is integral in the
sense of Definition 8.34. We will see in Sect. 12.2 that every analytically integral
element is algebraically integral, but not vice versa. In Chapter 13, we will show that
when K is simply connected, the two notions of integrality coincide.


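Proposition 12.5. Let $(\Sigma, V)$ be a representation of K and let $\sigma$ be the associated
representation of $\mathfrak{k}$. If $\lambda \in \mathfrak{t}$ is a real weight of $\sigma$, then $\lambda$ is an analytically integral
element.


Proof. If v is a weight vector with weight $\lambda$ and H is an element of $\Gamma$, then on the
one hand,

\[ \Sigma(e^{2\pi H}) v = I v = v, \]

while on the other hand,

\[ \Sigma(e^{2\pi H}) v = e^{2\pi \sigma(H)} v = e^{2\pi i \langle \lambda, H \rangle} v. \]

Thus, $e^{2\pi i \langle \lambda, H \rangle} = 1$, which implies that $\langle \lambda, H \rangle \in \mathbb{Z}$. □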


We are now ready to state the theorem of the highest weight for (finite- 
dimensional) representations of a connected compact group. 


Theorem 12.6 (Theorem of the Highest Weight). If K is a connected, compact 
matrix Lie group and T is a maximal torus in K, the following results hold. 


1. Every irreducible representation of K has a highest weight. 

2. Two irreducible representations of K with the same highest weight are 
isomorphic. 

3. The highest weight of each irreducible representation of K is dominant and 
analytically integral. 

4. If $\mu$ is a dominant, analytically integral element, there exists an irreducible
representation of K with highest weight $\mu$.


Let us suppose now that $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is semisimple. Even in this case, the set of
analytically integral elements may not coincide with the set of algebraically integral 
elements, as we will see in several examples in Sect. 12.2. Thus, the theorem of 
the highest weight for g (Theorems 9.4 and 9.5) will, in general, have a different 
set of possible highest weights than the theorem of the highest weight for K. This 
discrepancy arises because a representation of g may not come from a representation 
of K, unless K is simply connected. In the simply connected case, there is no 
such discrepancy; according to Corollary 13.20, when K is simply connected, every 
algebraically integral element is analytically integral. 

We will develop the tools for proving Theorem 12.6 in Sects. 12.2—12.4, with the 
proof itself coming in Sect. 12.5. The hard part of the theorem is Point 4; this will 
be established by appealing to a completeness result for characters. 

12.2 Analytically Integral Elements


In this section, we establish some elementary properties of the set of analytically 
integral elements (Definition 12.4) and consider several examples. One additional 
key result that we will establish in Chapter 13 is Corollary 13.20, which says that 
when K is simply connected, the set of analytically integral elements coincides with 
the set of algebraically integral elements. 


Proposition 12.7. 


1. The set of analytically integral elements is invariant under the action of the Weyl 
group. 

2. Every analytically integral element is algebraically integral. 

3. Every real root is analytically integral. 


We begin with an important lemma. 


Lemma 12.8. If $\alpha \in \mathfrak{t}$ is a real root and $H_\alpha = 2\alpha / \langle \alpha, \alpha \rangle$ is the associated real
coroot, we have

\[ e^{2\pi H_\alpha} = I \]

in K. That is to say, $H_\alpha$ belongs to the kernel $\Gamma$ of the exponential map.


Proof. By Corollary 7.20, there is a Lie algebra homomorphism $\phi : \mathfrak{su}(2) \to \mathfrak{k}$
such that the element iH = diag(i, −i) in $\mathfrak{su}(2)$ maps to the real coroot $H_\alpha$. (In
the notation of the corollary, $H_\alpha = 2E_1^\alpha$.) Since SU(2) is simply connected, there
is (Theorem 5.6) a homomorphism $\Phi : SU(2) \to K$ for which the associated Lie
algebra homomorphism is $\phi$. Now, the element $iH \in \mathfrak{su}(2)$ satisfies $e^{2\pi i H} = I$, and
thus

\[ I = \Phi(e^{2\pi i H}) = e^{2\pi \phi(iH)} = e^{2\pi H_\alpha}, \]

as claimed. □


Proof of Proposition 12.7. For Point 1, we have already shown (following
Definition 12.3) that $\Gamma$ is invariant under the action of W. Thus, if $\lambda$ is analytically
integral and w is the Weyl group element represented by x, we have
$\langle w \cdot \lambda, H \rangle = \langle \lambda, w^{-1} \cdot H \rangle \in \mathbb{Z}$ for all $H \in \Gamma$, since $w^{-1} \cdot H$ is in $\Gamma$. For Point 2, note that
for each $\alpha$, we have $H_\alpha \in \Gamma$ by Lemma 12.8. Thus, if $\lambda$ is analytically integral,
$\langle \lambda, H_\alpha \rangle \in \mathbb{Z}$, showing that $\lambda$ is algebraically integral. For Point 3, we note that the
real roots are real weights of the adjoint representation, which is a representation of
the group K, and not just its Lie algebra. Thus, by Proposition 12.5, the real roots are
analytically integral. □


Proposition 12.9. If $\lambda$ is an analytically integral element, there is a well-defined
function $f_\lambda : T \to \mathbb{C}$ such that

\[ f_\lambda(e^H) = e^{i \langle \lambda, H \rangle} \tag{12.2} \]

for all $H \in \mathfrak{t}$. Conversely, for any $\lambda \in \mathfrak{t}$, if there is a well-defined function on T
given by the right-hand side of (12.2), then $\lambda$ must be analytically integral.


Proof. Replacing H by $2\pi H$, we can equivalently write (12.2) as

\[ f_\lambda(e^{2\pi H}) = e^{2\pi i \langle \lambda, H \rangle}, \quad H \in \mathfrak{t}. \tag{12.3} \]

Now, since T is connected and commutative, every $t \in T$ can be written as
$t = e^{2\pi H}$ for some $H \in \mathfrak{t}$. Furthermore, $e^{2\pi (H + H')} = e^{2\pi H}$ if and only if $e^{2\pi H'} = I$,
that is, if and only if $H' \in \Gamma$. Thus, the right-hand side of (12.3) defines a function
on T if and only if $e^{2\pi i \langle \lambda, H + H' \rangle} = e^{2\pi i \langle \lambda, H \rangle}$ for all $H' \in \Gamma$. This happens if and
only if $e^{2\pi i \langle \lambda, H' \rangle} = 1$ for all $H' \in \Gamma$, that is, if and only if $\langle \lambda, H' \rangle \in \mathbb{Z}$ for all
$H' \in \Gamma$. □


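Proposition 12.10. The exponentials $f_\lambda$ in (12.2), as $\lambda$ ranges over the set of
analytically integral elements, are orthonormal with respect to the normalized
volume form dt on T:

\[ \int_T \overline{f_\lambda(t)}\, f_{\lambda'}(t)\,dt = \delta_{\lambda, \lambda'}. \tag{12.4} \]


Proof. Let us identify T with $(S^1)^k$ for some k, so that $\mathfrak{t}$ is identified with $\mathbb{R}^k$ and
the scaled exponential map is given by

\[ (\theta_1, \ldots, \theta_k) \mapsto (e^{2\pi i \theta_1}, \ldots, e^{2\pi i \theta_k}). \]

The kernel $\Gamma$ of the exponential is the integer lattice inside $\mathbb{R}^k$. The lattice of
analytically integral elements (points having integer inner product with each element
of $\Gamma$) is then also the integer lattice. Thus, the exponentials in the proposition are
the functions of the form

\[ f_\lambda(e^{i\theta_1}, \ldots, e^{i\theta_k}) = e^{i\lambda_1\theta_1} \cdots e^{i\lambda_k\theta_k} \]

with $\lambda = (\lambda_1, \ldots, \lambda_k) \in \mathbb{Z}^k$.

Meanwhile, if we use the coordinates $\theta_1, \ldots, \theta_k$ on T, then any k-form on T can
be represented as a density $\rho$ times $d\theta_1 \wedge \cdots \wedge d\theta_k$. Since the volume form dt on T is
translation invariant, the density $\rho$ must be constant. Thus, the normalized integral
in (12.4) becomes

\[
\int_T \overline{f_\lambda(t)}\, f_{\lambda'}(t)\,dt
= (2\pi)^{-k} \int_0^{2\pi} \cdots \int_0^{2\pi} e^{-i\lambda_1\theta_1} \cdots e^{-i\lambda_k\theta_k}\, e^{i\lambda'_1\theta_1} \cdots e^{i\lambda'_k\theta_k}\,d\theta_1 \cdots d\theta_k
= (2\pi)^{-k} \left( \int_0^{2\pi} e^{i(\lambda'_1 - \lambda_1)\theta_1}\,d\theta_1 \right) \cdots \left( \int_0^{2\pi} e^{i(\lambda'_k - \lambda_k)\theta_k}\,d\theta_k \right).
\]

The claimed orthonormality then follows from direct computation. □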
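The orthonormality relation (12.4) can also be checked numerically. The sketch below identifies T with $(S^1)^k$ as in the proof and replaces the normalized integral by an average over a uniform grid in the angles; for these trigonometric polynomials the average is exact, up to rounding, as long as the grid size exceeds every $|\lambda'_j - \lambda_j|$. The function names and the grid size are illustrative choices.

```python
import cmath
import math
from itertools import product

def f(lam, thetas):
    """The exponential f_lambda at the torus point (e^{i theta_1}, ..., e^{i theta_k})."""
    return cmath.exp(1j * sum(l * th for l, th in zip(lam, thetas)))

def inner_product(lam, lam_prime, n_grid=16):
    """Grid average approximating the normalized integral of conj(f_lam) * f_lam' over T."""
    k = len(lam)
    grid = [2 * math.pi * j / n_grid for j in range(n_grid)]
    total = 0
    for thetas in product(grid, repeat=k):
        total += f(lam, thetas).conjugate() * f(lam_prime, thetas)
    return total / n_grid ** k
```

For example, `inner_product((1, 2), (1, 2))` is approximately 1, while `inner_product((1, 2), (0, 2))` is approximately 0.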

Fig. 12.1 Dominant, analytically integral elements (black dots) and dominant, algebraically
integral elements (black and white dots) for SO(3) 


We now calculate the algebraically integral and analytically integral elements 
in several examples, with an eye toward clarifying the distinction between the 
two notions. When K is simply connected, Corollary 13.20 shows that the sets
of analytically integral and algebraically integral elements coincide. Thus, in the
simply connected case, the calculations in Sects. 6.7 and 8.7 provide examples of 
the set of analytically integral elements. We consider now three groups that are not 
simply connected. 


Example 12.11. Consider the group SO(3) and let $\mathfrak{t}$ be the maximal commutative
subalgebra spanned by the element $F_3$ in Example 3.27. Let the unique positive root
$\alpha$ be chosen so that $\langle \alpha, F_3 \rangle = 1$. Then $\mu \in \mathfrak{t}$ is dominant and algebraically integral
if and only if $\mu = m\alpha/2$, where m is a non-negative integer, and $\mu \in \mathfrak{t}$ is dominant
and analytically integral if and only if $\mu = m\alpha$, where m is a non-negative integer.


See Figure 12.1. Note that in this case, $\delta$ (half the sum of the positive roots) is
equal to $\alpha/2$, which is not an analytically integral element.


Proof. Following Sect. 7.7, but adjusting for our current convention of using real
roots (Definition 11.34), we identify $\mathfrak{t}$ with $\mathbb{R}$ by mapping $aF_3$ to a. The roots are
then the numbers $\pm 1$, where we take 1 as our positive root $\alpha$, so that $\langle \alpha, F_3 \rangle = 1$.
Then $\mu \in \mathfrak{t} \cong \mathbb{R}$ is dominant if and only if $\mu \geq 0$. Furthermore, $\mu$ is algebraically
integral if and only if

\[ 2 \frac{\langle \mu, \alpha \rangle}{\langle \alpha, \alpha \rangle} = 2 \frac{(\mu)(1)}{(1)(1)} \in \mathbb{Z}, \]

that is, if and only if $2\mu$ is an integer. Thus, the dominant, algebraically integral
elements are the numbers of the form $m/2 = m\alpha/2$.

Now, $e^{2\pi a F_3} = I$ if and only if a is an integer. Thus, $\mu$ is analytically integral if
and only if $(\mu)(a) \in \mathbb{Z}$ for all $a \in \mathbb{Z}$, that is, if and only if $\mu$ is an integer. Thus, the
dominant, analytically integral elements are the numbers of the form $m = m\alpha$. □
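The two integrality conditions for SO(3) are simple enough to tabulate directly. The following sketch uses the identification of $\mathfrak{t}$ with $\mathbb{R}$ from the proof (so that $\alpha = 1$ and $\Gamma = \mathbb{Z}$) and exact rational arithmetic. It also assumes, as is standard for this basis element, that $e^{\theta F_3}$ is the rotation by angle $\theta$ about the third axis, so that only the cosine and sine of the angle need to be inspected. All function names are illustrative.

```python
import math
from fractions import Fraction

def exp_2pi_aF3_is_identity(a, tol=1e-12):
    """Check e^{2 pi a F_3} = I, assuming e^{theta F_3} is rotation by theta about the third axis."""
    angle = 2 * math.pi * float(a)
    return abs(math.cos(angle) - 1) < tol and abs(math.sin(angle)) < tol

def is_algebraically_integral(mu):
    """2 <mu, alpha> / <alpha, alpha> is an integer, with alpha = 1."""
    return (2 * mu).denominator == 1

def is_analytically_integral(mu):
    """<mu, a> is an integer for every a in Gamma = Z, i.e., mu itself is an integer."""
    return mu.denominator == 1

# Dominant candidates mu = m * alpha / 2 for m = 0, ..., 4.
candidates = [Fraction(m, 2) for m in range(5)]
analytic = [mu for mu in candidates if is_analytically_integral(mu)]
delta = Fraction(1, 2)  # half the sum of the positive roots
```

Every candidate is algebraically integral, but only the integers 0, 1, 2 are analytically integral; in particular `delta` is not.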


We consider next the group U(2), with $\mathfrak{t}$ consisting of diagonal matrices with
pure imaginary diagonal entries. We identify $\mathfrak{t}$ with $\mathbb{R}^2$ by mapping diag(ia, ib) to
(a, b). The roots are then the elements (1, −1) and (−1, 1), and we
select $\alpha := (1, -1)$ as our positive root. We now decompose every $H \in \mathfrak{t}$ as a linear
combination of the vectors

\[ \alpha = (1, -1); \quad \beta = (1, 1). \tag{12.5} \]

Fig. 12.2 The dominant, analytically integral elements (black dots) and nondominant, analytically
integral elements (white dots) for U(2). The vertical lines indicate the algebraically integral 
elements 


Example 12.12. Let $\mathfrak{t}$ be the diagonal subalgebra of $\mathfrak{u}(2)$, and write every element
$\lambda \in \mathfrak{t}$ as

\[ \lambda = c\alpha + d\beta, \]

with $\alpha$ and $\beta$ as in (12.5). Then $\lambda$ is analytically integral if and only if either c and d
are both integers or c and d are both of the form integer plus one-half. Furthermore,
$\lambda$ is dominant if and only if $c \geq 0$. On the other hand, $\lambda$ is algebraically integral if
and only if c is either an integer or an integer plus one-half.


In Figure 12.2, the black dots are the dominant, analytically integral elements and 
the white dots are the nondominant, analytically integral elements. All the points in 
the vertical lines are algebraically integral. 


Proof. If H = diag(ia, ib), then $e^{2\pi H} = I$ if and only if a and b are both integers.
Thus, when we identify $\mathfrak{t}$ with $\mathbb{R}^2$, the kernel $\Gamma$ of the exponential corresponds to the
integer lattice $\mathbb{Z}^2$. The lattice of analytically integral elements is then also identified
with $\mathbb{Z}^2$. Now, it is straightforward to check that $c\alpha + d\beta$ is in $\mathbb{Z}^2$ if and only if either
c and d are both integers or c and d are both integers plus one-half, accounting for
the claimed form of the analytically integral elements. Since $\beta$ is orthogonal to $\alpha$,
an element $\lambda = c\alpha + d\beta$ has non-negative inner product with $\alpha$ if and only if $c \geq 0$,
accounting for the claimed condition for $\lambda$ to be dominant. Finally, $\lambda$ is algebraically
integral if and only if $2 \langle \lambda, \alpha \rangle / \langle \alpha, \alpha \rangle$ is an integer, which happens if c is either an
integer or an integer plus one-half, with no restriction on d. □
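The "straightforward to check" lattice condition in this proof can be confirmed by brute force on a sample of coefficients. The sketch below takes the two orthogonal vectors to be (1, −1) and (1, 1); the lattice condition depends only on c + d and c − d, so it does not matter which of the two is labeled $\alpha$. The sampling range (multiples of 1/4 between −2 and 2) and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

alpha = (1, -1)
beta = (1, 1)

def combo(c, d):
    """The vector c * alpha + d * beta, computed exactly."""
    return tuple(c * a + d * b for a, b in zip(alpha, beta))

def in_integer_lattice(v):
    """True when every coordinate of v is an integer."""
    return all(Fraction(x).denominator == 1 for x in v)

def both_integers_or_both_half_integers(c, d):
    c, d = Fraction(c), Fraction(d)
    both_int = c.denominator == 1 and d.denominator == 1
    both_half = c.denominator == 2 and d.denominator == 2  # integer plus one-half
    return both_int or both_half

quarters = [Fraction(n, 4) for n in range(-8, 9)]
mismatches = [
    (c, d)
    for c, d in product(quarters, repeat=2)
    if in_integer_lattice(combo(c, d)) != both_integers_or_both_half_integers(c, d)
]
```

On this sample the list `mismatches` is empty, consistent with the characterization of the analytically integral elements in Example 12.12.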

Fig. 12.3 Dominant, analytically integral elements (black dots) and dominant, algebraically
integral elements (black and white dots) for SO(5) 


Example 12.13. The dominant, analytically integral elements for SO(5) are as 
shown in Figure 12.3. 


The figure shows the dominant, analytically integral elements (black dots). The
white dots are dominant, algebraically integral elements that are not analytically
integral. The background square lattice is the set of all analytically integral elements.
Note that in Figure 12.3, the $B_2$ root system is rotated by $\pi/4$ compared to
Figure 8.11. If we rotate Figure 12.3 clockwise by $\pi/4$ and then reflect across the
x-axis, the set of dominant algebraically integral elements in Figure 12.3 (black and
white dots) will match the set of “dominant integral” elements in Figure 8.11. Note
that $\delta$ (half the sum of the positive roots) is not analytically integral.


Proof. Elements of $\mathfrak{t}$ are of the form

\[
H = \begin{pmatrix}
0 & a & & & \\
-a & 0 & & & \\
 & & 0 & b & \\
 & & -b & 0 & \\
 & & & & 0
\end{pmatrix}, \tag{12.6}
\]

with $a, b \in \mathbb{R}$. Following Sect. 7.7, but adjusting for our current convention of using
real roots, we identify $\mathfrak{t}$ with $\mathbb{R}^2$ by means of the map $H \mapsto (a, b)$. The roots are
then the elements $(\pm 1, 0)$, $(0, \pm 1)$, and $(\pm 1, \pm 1)$. As a base, we take $\alpha_1 = (1, -1)$
and $\alpha_2 = (0, 1)$. Furthermore, (x, y) is algebraically integral if

\[ 2 \frac{\langle (1, -1), (x, y) \rangle}{2} = x - y \in \mathbb{Z} \]

and

\[ 2 \frac{\langle (0, 1), (x, y) \rangle}{1} = 2y \in \mathbb{Z}. \]

These conditions hold if and only if either x and y are both integers or x and y are
both of the form “integer plus one-half.”

Now, if H is as in (12.6), $e^{2\pi H} = I$ if and only if a and b are integers. Thus,
under our identification of $\mathfrak{t}$ with $\mathbb{R}^2$, the kernel of the exponential map is the set of
elements of the form (a, b), with $a, b \in \mathbb{Z}$. Thus, (x, y) is analytically integral if

\[ \langle (a, b), (x, y) \rangle = ax + by \in \mathbb{Z} \]

for all $a, b \in \mathbb{Z}$. This condition holds if and only if x and y are both integers.
Finally, (x, y) is dominant if it has non-negative inner product with each of $\alpha_1$
and $\alpha_2$, which happens if (x, y) is in the 45-degree sector indicated by dashed lines
in Figure 12.3. □
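The remark that half the sum of the positive roots fails to be analytically integral for SO(5) can be checked with exact arithmetic. The sketch below uses the identification of $\mathfrak{t}$ with $\mathbb{R}^2$ from the proof, the base (1, −1), (0, 1), and the resulting list of positive roots (1, −1), (0, 1), (1, 0), (1, 1); the helper names are illustrative.

```python
from fractions import Fraction

alpha1 = (Fraction(1), Fraction(-1))
alpha2 = (Fraction(0), Fraction(1))
# Positive roots: non-negative integer combinations of the base that are roots.
positive_roots = [alpha1, alpha2, (Fraction(1), Fraction(0)), (Fraction(1), Fraction(1))]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_algebraically_integral(lam):
    """2 <lam, alpha> / <alpha, alpha> is an integer for every positive root alpha."""
    return all((2 * inner(lam, a) / inner(a, a)).denominator == 1 for a in positive_roots)

def is_analytically_integral(lam):
    """Gamma is the integer lattice, so lam must have integer coordinates."""
    return all(coord.denominator == 1 for coord in lam)

def is_dominant(lam):
    return inner(lam, alpha1) >= 0 and inner(lam, alpha2) >= 0

# Half the sum of the positive roots.
delta = tuple(sum(r[i] for r in positive_roots) / 2 for i in range(2))
```

Here `delta` comes out as (3/2, 1/2), which is dominant and algebraically integral but not analytically integral.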

In this section, we show that the characters of irreducible representations form an
orthonormal set and that these characters are complete in the space of continuous 
class functions on K. 


Definition 12.14. Suppose $(\Pi, V)$ is a representation of K. Then the character of
$\Pi$ is the function $\chi_\Pi : K \to \mathbb{C}$ given by

\[ \chi_\Pi(x) = \mathrm{trace}(\Pi(x)). \]

Note that we now consider the character as a function on the group K, rather than
on the Lie algebra $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$, as in Chapter 10. If $\pi$ is the associated representation
of $\mathfrak{g}$, then the character $\chi_\pi$ of $\pi$ (Definition 10.11) is related to the character $\chi_\Pi$ of
$\Pi$ by

\[ \chi_\Pi(e^H) = \chi_\pi(H), \quad H \in \mathfrak{t}. \]

Note that each character is a class function on K:

\[ \chi_\Pi(y x y^{-1}) = \mathrm{trace}(\Pi(y) \Pi(x) \Pi(y)^{-1}) = \mathrm{trace}(\Pi(x)). \]

The following theorem says that the characters of irreducible representations form
an orthonormal set in the space of class functions.


Theorem 12.15. If $(\Pi, V)$ and $(\Sigma, W)$ are irreducible representations of K, then

\[
\int_K \overline{\mathrm{trace}(\Pi(x))}\,\mathrm{trace}(\Sigma(x))\,dx =
\begin{cases} 1 & \text{if } V \cong W \\ 0 & \text{if } V \not\cong W \end{cases},
\]

where dx is the normalized left-invariant volume form on K.

If $(\Pi, V)$ is a representation of K, let $V^K$ denote the space given by

\[ V^K = \{v \in V \mid \Pi(x) v = v \text{ for all } x \in K\}. \]

Lemma 12.16. Suppose $(\Pi, V)$ is a finite-dimensional representation of K, and let
P be the operator on V given by

\[ P = \int_K \Pi(x)\,dx. \]

Then P is a projection onto $V^K$. That is to say, P maps V into $V^K$ and Pv = v for
all $v \in V^K$.


Clearly, $V^K$ is an invariant subspace for $\Pi$. If we pick an inner product on V for
which $\Pi$ is unitary, then $(V^K)^\perp$ is also invariant under each $\Pi(x)$ and thus under P.
But since P maps into $V^K$, the map P must be zero on $(V^K)^\perp$; thus, P is actually
the orthogonal projection onto $V^K$.


Proof. For any $y \in K$ and $v \in V$, we have

\[
\Pi(y) P v = \Pi(y) \left( \int_K \Pi(x)\,dx \right) v = \left( \int_K \Pi(yx)\,dx \right) v = P v,
\]

by the left-invariance of the form dx. This shows that Pv belongs to $V^K$. Meanwhile,
if $v \in V^K$, then

\[
P v = \int_K \Pi(x) v\,dx = \left( \int_K dx \right) v = v,
\]

by the normalization of the volume form dx. □

Note that if V is irreducible and nontrivial, then $V^K = \{0\}$. In this case, the
lemma says that $\int_K \Pi(x)\,dx = 0$.
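A small numerical illustration of Lemma 12.16 is possible for a hypothetical example not taken from the text: K = SO(2), acting on $\mathbb{R}^3$ by rotations about the third axis. The invariant vectors are the multiples of (0, 0, 1), so the averaged operator P should be the projection diag(0, 0, 1). The sketch below replaces the normalized integral over K by an average over equally spaced angles.

```python
import math

def rep(theta):
    """Pi(theta): rotation by theta about the third axis, as a 3x3 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

n_grid = 64
# P approximates the integral of Pi(x) over K with respect to normalized Haar measure.
P = [[0.0] * 3 for _ in range(3)]
for j in range(n_grid):
    M = rep(2 * math.pi * j / n_grid)
    for row in range(3):
        for col in range(3):
            P[row][col] += M[row][col] / n_grid

trace_P = sum(P[i][i] for i in range(3))  # should equal the dimension of the invariant subspace
```

Up to rounding error, P is diag(0, 0, 1) and its trace is 1, the dimension of the space of invariant vectors.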


Lemma 12.17. For $A : V \to V$ and $B : W \to W$, we have

\[ \mathrm{trace}(A)\,\mathrm{trace}(B) = \mathrm{trace}(A \otimes B), \]

where $A \otimes B : V \otimes W \to V \otimes W$ is as in Proposition 4.16.


Proof. If $\{v_j\}$ and $\{w_l\}$ are bases for V and W, respectively, then $\{v_j \otimes w_l\}$ is a
basis for $V \otimes W$. If $A_{jk}$ and $B_{lm}$ are the matrices of A and B with respect to $\{v_j\}$
and $\{w_l\}$, respectively, then the matrix of $A \otimes B$ with respect to $\{v_j \otimes w_l\}$ is easily
seen to be

\[ (A \otimes B)_{(j,l),(k,m)} = A_{jk} B_{lm}. \]

Thus,

\[ \mathrm{trace}(A \otimes B) = \sum_{j,l} A_{jj} B_{ll} = \mathrm{trace}(A)\,\mathrm{trace}(B), \]

as claimed. □
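The index bookkeeping in this proof can be made concrete with a short Python sketch. It builds the matrix of $A \otimes B$ entry by entry from the formula $(A \otimes B)_{(j,l),(k,m)} = A_{jk} B_{lm}$, ordering the basis $v_j \otimes w_l$ lexicographically in (j, l), and compares traces. The sample matrices are arbitrary.

```python
def kron(a, b):
    """Matrix of A (x) B in the basis v_j (x) w_l, ordered lexicographically in (j, l)."""
    return [
        [a[j][k] * b[l][m] for k in range(len(a)) for m in range(len(b))]
        for j in range(len(a)) for l in range(len(b))
    ]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]                    # trace 5
B = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]   # trace 12

AB = kron(A, B)
```

Here `trace(AB)` equals `trace(A) * trace(B)`, since the diagonal entries of `AB` are exactly the products $A_{jj} B_{ll}$.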
Proof of Theorem 12.15. We know that there exists an inner product on V for which
each $\Pi(x)$ is unitary. Thus,

\[ \overline{\mathrm{trace}(\Pi(x))} = \mathrm{trace}(\Pi(x)^*) = \mathrm{trace}(\Pi(x^{-1})). \tag{12.7} \]

Recall from Sect. 4.3.3 that for any $A : V \to V$, we have the transpose operator
$A^{tr} : V^* \to V^*$. Since the matrix of $A^{tr}$ with respect to the dual of any basis
$\{v_j\}$ of V is the transpose of the matrix of A with respect to $\{v_j\}$, we see that
$\mathrm{trace}(A^{tr}) = \mathrm{trace}(A)$. Thus,

\[ \overline{\mathrm{trace}(\Pi(x))} = \mathrm{trace}(\Pi(x^{-1})^{tr}) = \mathrm{trace}(\Pi^{tr}(x)), \]

where $\Pi^{tr}$ is the dual representation to $\Pi$. Thus, the complex conjugate of the
character of $\Pi$ is the character of the dual representation $\Pi^{tr}$ of $\Pi$.

Using Lemma 12.17, we then obtain

\[
\overline{\mathrm{trace}(\Pi(x))}\,\mathrm{trace}(\Sigma(x)) = \mathrm{trace}(\Pi^{tr}(x) \otimes \Sigma(x))
= \mathrm{trace}((\Pi^{tr} \otimes \Sigma)(x)).
\]

By Lemma 12.16, this becomes

\[
\int_K \overline{\mathrm{trace}(\Pi(x))}\,\mathrm{trace}(\Sigma(x))\,dx
= \int_K \mathrm{trace}((\Pi^{tr} \otimes \Sigma)(x))\,dx
= \mathrm{trace}\left( \int_K (\Pi^{tr} \otimes \Sigma)(x)\,dx \right)
= \mathrm{trace}(P)
= \dim((V^* \otimes W)^K), \tag{12.8}
\]

where P is a projection of $V^* \otimes W$ onto $(V^* \otimes W)^K$.

Now, for any two finite-dimensional vector spaces V and W, there is a natural
isomorphism between $V^* \otimes W$ and End(V, W), the space of linear maps from V
to W. This isomorphism is actually an intertwining map of representations, where
$x \in K$ acts on $A \in \mathrm{End}(V, W)$ by

\[ x \cdot A = \Sigma(x) A \Pi(x)^{-1}. \]

Finally, under this isomorphism, $(V^* \otimes W)^K$ maps to the space of intertwining
maps of V to W. (See Exercises 3 and 4 for the proofs of the preceding claims.)
By Schur’s lemma, the space of intertwining maps has dimension 1 if $V \cong W$ and
dimension 0 otherwise. Thus, (12.8) reduces to the claimed result. □


Our next result says that the characters form a complete orthonormal set in the 
space of class functions on K. 


Theorem 12.18. Suppose $f$ is a continuous class function on $K$ and that for every
finite-dimensional, irreducible representation $\Pi$ of $K$, the function $f$ is orthogonal
to the character of $\Pi$:

\[
\int_K \overline{f(x)}\,\operatorname{trace}(\Pi(x)) \, dx = 0.
\]

Then $f$ is identically zero.


The proof given in this section assumes in an essential way that K is a compact 
matrix Lie group. (Of course, we work throughout the book with matrix Lie groups, 
but most of the proofs we give extend with minor modifications to arbitrary Lie 
groups.) In Appendix D, we sketch a proof of Theorem 12.18 that does not rely on 
the assumption that K is a matrix group. That proof, however, requires a bit more 
functional analysis than the proof given in this section. 

We now consider a class of functions called matrix entries, which include as a
special case the characters of representations. We will prove a completeness result
for matrix entries and then specialize this result to class functions in order to obtain 
completeness for characters. 
Definition 12.19. If $(\Pi, V)$ is a representation of $K$ and $\{v_j\}$ is a basis for $V$, the
functions $f : K \to \mathbb{C}$ of the form

\[
f(x) = (\Pi(x))_{jk} \tag{12.9}
\]

are called matrix entries for $\Pi$. Here $(\Pi(x))_{jk}$ denotes the $(j, k)$ entry of the matrix
of $\Pi(x)$ in the basis $\{v_j\}$.

In a slight abuse of notation, we will also call $f$ a matrix entry for $\Pi$ if $f$ is
expressible as a linear combination of the functions in (12.9):

\[
f(x) = \sum_{j,k} c_{jk} (\Pi(x))_{jk}. \tag{12.10}
\]

We may write functions of the form (12.10) in a basis-independent way as

\[
f(x) = \operatorname{trace}(\Pi(x)A),
\]

where $A$ is the operator on $V$ whose matrix in the basis $\{v_j\}$ is $A_{jk} = c_{kj}$. (The
sum over $k$ computes the product of $\Pi(x)$ with $A$ and the sum over $j$ computes the
trace.) Note that if $A = I$ then $f$ is the character of $\Pi$.


Lemma 12.20. If $(\Pi, V)$ is an irreducible representation of $K$, then for each
operator $A$ on $V$, we have

\[
\int_K \Pi(y)^{-1} A \Pi(y) \, dy = cI \tag{12.11}
\]

for some constant $c$.


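Proof. Let $B$ denote the operator on the left-hand side of (12.11). By the
translation invariance of the integral (the normalized volume form on the compact
group $K$ is both left- and right-invariant), we have

\[
\Pi(x)^{-1} B \Pi(x) = \int_K \Pi(yx)^{-1} A \Pi(yx) \, dy
= \int_K \Pi(y)^{-1} A \Pi(y) \, dy
= B.
\]

Thus, $B$ commutes with each $\Pi(x)$, which, by Schur's lemma, implies that $B$ is a
multiple of the identity. □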
We are now ready for the proof of our completeness result for characters. 


Proof of Theorem 12.18. Let $\mathcal{A}$ denote the space of continuous functions on $K$
that can be expressed as a linear combination of matrix entries, for some finite
collection of representations of $K$. We claim that $\mathcal{A}$ satisfies the hypotheses of the
complex version of the Stone-Weierstrass theorem, namely, that it is an algebra, that it
vanishes nowhere, that it is closed under complex conjugation, and that it separates
points. (See Theorem 7.33 in [Rud1].) First, using Lemma 12.17, we see that the
product of two matrix entries is a matrix entry for the tensor product representation,
which decomposes as a direct sum of irreducibles. Thus, the product of two matrix
entries is expressible as a linear combination of matrix entries, showing that $\mathcal{A}$
is an algebra. Second, the matrix entries of the trivial representation are nonzero
constants, showing that $\mathcal{A}$ is nowhere vanishing. Third, by a simple extension
of (12.7), we can show that the complex conjugate of a matrix entry is a matrix
entry for the dual representation. Last, since $K$ is a matrix Lie group, it has, by
definition, a faithful finite-dimensional representation. The matrix entries of any
such representation separate points in $K$.

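Thus, the complex version of the Stone-Weierstrass theorem applies to $\mathcal{A}$,
meaning that if $f$ is continuous, then for each $\varepsilon > 0$ we can find $g \in \mathcal{A}$ such that $g$
is everywhere within $\varepsilon$ of $f$. If $f$ is a class function, then for all $x, y \in K$, we have

\[
|f(x) - g(yxy^{-1})| = |f(yxy^{-1}) - g(yxy^{-1})| < \varepsilon.
\]

Thus, if $h$ is given by

\[
h(x) = \int_K g(yxy^{-1}) \, dy,
\]

then $h$ will also be everywhere within $\varepsilon$ of $f$.
Now, by assumption, $g$ can be represented as

\[
g(x) = \sum_j \operatorname{trace}(\Pi_j(x)A_j)
\]

for some family of representations $(\Pi_j, V_j)$ of $K$ and some operators $A_j \in
\operatorname{End}(V_j)$. Since every finite-dimensional representation of $K$ decomposes as a direct
sum of irreducibles, we may take each $\Pi_j$ to be irreducible. We can then easily
compute that

\[
h(x) = \sum_j \operatorname{trace}(\Pi_j(x)B_j),
\]

where

\[
B_j = \int_K \Pi_j(y)^{-1} A_j \Pi_j(y) \, dy.
\]

But by Lemma 12.20, each $B_j$ is a multiple of the identity, which means that $h$ is a
linear combination of characters.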

We conclude, then, that every continuous class function $f$ can be uniformly
approximated by a sequence of functions $h_n$, where each $h_n$ is a linear combination
of characters. If $f$ is orthogonal to every character, $f$ is orthogonal to each $h_n$.
Then, by letting $n$ tend to infinity, we find that $f$ is orthogonal to itself, meaning
that $\int_K |f(x)|^2 \, dx = 0$. Since $f$ is continuous, this can only happen if $f$ is
identically zero. □


We used the assumption that $K$ is a matrix Lie group to show that the algebra
$\mathcal{A}$ of matrix entries separates points in $K$, which allowed us to prove (using the
Stone-Weierstrass theorem) that $\mathcal{A}$ is dense in the space of continuous functions on $K$. In Appendix D,
we sketch a proof of Theorem 12.18 that does not assume ahead of time that K is a 
matrix group. 


12.4 The Analytic Proof of the Weyl Character Formula 


In order to simplify certain parts of the analysis, we make the following temporary
assumption concerning $\delta$ (half the sum of the real, positive roots).


Assumption 12.21 In this section and in Sect. 12.5, we assume that the element $\delta$
is analytically integral.


As we will show in Chapter 13 (Corollary 13.20), $\delta$ is analytically integral
whenever $K$ is simply connected. In Sect. 12.6, we describe the modifications
needed to the arguments when $\delta$ is not analytically integral. We will give, in this
section, an analytic proof of the Weyl character formula, as an alternative to the
algebraic argument in Sect. 10.8. The proof is based on a more explicit version of the
Weyl integral formula, obtained by computing the weight function $\rho(t)$ occurring in
Theorem 11.30. We will see that the function $\rho$ is the squared modulus of another function,
which turns out to be our old friend, the Weyl denominator. This computation will
provide a crucial link between the Weyl integral formula and the Weyl character
formula.

If $\delta$ denotes half the sum of the positive real roots, then the Weyl denominator
function (Definition 10.15) takes the form

\[
q(H) = \sum_{w \in W} \det(w) e^{i\langle w \cdot \delta, H \rangle}, \quad H \in \mathfrak{t}. \tag{12.12}
\]

By Assumption 12.21 and Proposition 12.9, each exponential $e^{i\langle w \cdot \delta, H \rangle}$, $w \in W$,
defines a function on $T$. Thus, there is a unique function $Q : T \to \mathbb{C}$ satisfying

\[
Q(e^H) = q(H), \quad H \in \mathfrak{t}. \tag{12.13}
\]

We now state Weyl's formula for the character (Definition 12.14) $\chi_\Pi$ of $\Pi$.


Theorem 12.22 (Weyl Character Formula). Suppose $(\Pi, V)$ is an irreducible
representation of $K$ and that $\mu$ is a maximal weight for $\Pi$. Then $\mu$ is dominant
and analytically integral, and $\mu$ is actually the highest weight of $\Pi$. Furthermore,
the character of $\Pi$ is given by the formula

\[
\chi_\Pi(e^H) = \frac{\sum_{w \in W} \det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}}{q(H)} \tag{12.14}
\]

at all points $e^H \in T$ for which $q(H) \neq 0$. In particular, every irreducible
representation has a highest weight that is dominant and analytically integral.


The character formula amounts to saying that $Q(t)\chi_\Pi(t)$ is an alternating
sum of exponentials. The right-hand side of (12.14) is the same expression as in
Theorem 10.14, adjusted for our current convention of using real roots and real
weights.


Example 12.23. Let $K = SU(2)$, let $T$ be the diagonal subgroup, and let $\Pi_m$ be
the irreducible representation for which the largest eigenvalue of $\pi_m(H)$ is $m$. Then
the character formula for $\Pi_m$ takes the form

\[
\chi_{\Pi_m}\left( \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \right) = \frac{\sin((m + 1)\theta)}{\sin\theta}.
\]

This formula may also be obtained from Example 10.13 with $a = i\theta$.


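Proof. If $H = \operatorname{diag}(1, -1)$, then we may choose the unique positive root $\alpha$ to satisfy
$\langle \alpha, H \rangle = 2$. The highest weight $\mu$ of a representation then satisfies $\langle \mu, H \rangle = m$.
Note that $\operatorname{diag}(e^{i\theta}, e^{-i\theta}) = e^{i\theta H}$, that $\delta$ satisfies $\langle \delta, H \rangle = \langle \alpha/2, H \rangle = 1$, and that
$W = \{1, -1\}$. Thus, the character formula reads

\[
\chi_{\Pi_m}(\operatorname{diag}(e^{i\theta}, e^{-i\theta})) = \frac{e^{i(m+1)\theta} - e^{-i(m+1)\theta}}{e^{i\theta} - e^{-i\theta}},
\]

which simplifies to the claimed expression. □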
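The closed form in Example 12.23 can be compared numerically with the direct definition of the character as a sum over weights. The sketch below assumes the standard fact (not restated in this section) that the weights of $\Pi_m$ are $m, m-2, \ldots, -m$, each with multiplicity one; the function names are hypothetical.

```python
import cmath
import math


def character_from_weights(m, theta):
    """Sum of exp(i*k*theta) over the assumed weights k = m, m-2, ..., -m."""
    return sum(cmath.exp(1j * (m - 2 * k) * theta) for k in range(m + 1))


def character_from_weyl(m, theta):
    """Closed form sin((m+1)theta)/sin(theta) from the character formula."""
    return math.sin((m + 1) * theta) / math.sin(theta)


# Largest discrepancy over a few values of m and a few angles with sin(theta) != 0.
max_gap = max(abs(character_from_weights(m, t) - character_from_weyl(m, t))
              for m in range(6) for t in (0.3, 1.1, 2.5))
```

The two expressions should agree up to floating-point rounding at every angle where $\sin\theta \neq 0$.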


The hard part of the proof of the character formula is showing that in the product
$q(H)\chi_\Pi(e^H)$ of the Weyl denominator and a character, no other exponentials occur
besides the ones of the form $e^{i\langle w \cdot (\mu + \delta), H \rangle}$, $w \in W$. Now, Theorem 12.15 tells us that
the norm of $\chi_\Pi$ is 1:

\[
\int_K |\chi_\Pi(x)|^2 \, dx = 1. \tag{12.15}
\]


In the analytic approach, the unwanted exponentials will be ruled out by applying 
the Weyl integral formula to show that if any other exponentials did occur, the 
integral on the left-hand side of (12.15) would be greater than 1. To make this 
argument work, we must work out the Weyl integral formula in a more explicit form. 


Proposition 12.24. If $\delta$ is analytically integral, the Weyl integral formula takes the
form

\[
\int_K f(x) \, dx = \frac{1}{|W|} \int_T |Q(t)|^2 \int_{K/T} f(yty^{-1}) \, d[y] \, dt, \tag{12.16}
\]

where $dx$, $d[y]$, and $dt$ are the normalized volume forms on $K$, $K/T$, and $T$,
respectively, and where $Q$ is as in (12.13).


In the proof of this proposition, we also verify the correctness of the
normalization constant in the Weyl integral formula (Theorem 11.30), a point that was
not addressed in Sect. 11.6. (We will address the normalization issue when $\delta$ is not
integral in Sect. 12.6.) As we have already indicated above, knowing the correct
normalization constant is an essential part of our analytic proof of the Weyl character
formula.


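Proof. We may extend $\mathrm{Ad}_{t^{-1}}$ to a complex-linear operator on $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$, where $\mathfrak{g}$
is the direct sum of $\mathfrak{h} := \mathfrak{t}_{\mathbb{C}}$ and $\mathfrak{f}_{\mathbb{C}}$, where $\mathfrak{f}_{\mathbb{C}}$ is the orthogonal complement of
$\mathfrak{h}$ in $\mathfrak{g}$. Meanwhile, $\mathfrak{g}$ also decomposes as the direct sum of $\mathfrak{h}$ and the root spaces
$\mathfrak{g}_\alpha$. We now claim that each $\mathfrak{g}_\alpha$ is orthogonal to $\mathfrak{h}$. To see this, choose $H \in \mathfrak{t}$ for
which $\langle \alpha, H \rangle \neq 0$. Then each $H' \in \mathfrak{h}$ is an eigenvector for $\mathrm{ad}_H$ with eigenvalue 0,
whereas $X \in \mathfrak{g}_\alpha$ is an eigenvector for $\mathrm{ad}_H$ with a nonzero eigenvalue. Since $\mathrm{ad}_H$ is
skew self-adjoint, $H'$ and $X$ must be orthogonal. Thus, $\mathfrak{f}_{\mathbb{C}}$ is actually the
direct sum of the $\mathfrak{g}_\alpha$'s.
Now, if $\alpha$ is a real root and $X$ belongs to $\mathfrak{g}_\alpha$, then for $t = e^H$ we have

\[
\mathrm{Ad}_{e^{-H}}(X) = e^{-\mathrm{ad}_H}(X) = e^{-i\langle \alpha, H \rangle} X.
\]

Thus, letting $\mathrm{Ad}'_{e^{-H}}$ denote the restriction of $\mathrm{Ad}_{e^{-H}}$ to $\mathfrak{f}_{\mathbb{C}}$, we have

\[
\det(\mathrm{Ad}'_{e^{-H}} - I) = \prod_{\alpha \in R} (e^{-i\langle \alpha, H \rangle} - 1)
= \prod_{\alpha \in R^+} (e^{i\langle \alpha, H \rangle} - 1)(e^{-i\langle \alpha, H \rangle} - 1).
\]

Since $(e^{-i\theta} - 1)(e^{i\theta} - 1) = |e^{i\theta/2} - e^{-i\theta/2}|^2$, we have

\[
\det(\mathrm{Ad}'_{e^{-H}} - I) = \left| \prod_{\alpha \in R^+} (e^{i\langle \alpha, H \rangle/2} - e^{-i\langle \alpha, H \rangle/2}) \right|^2 = |q(H)|^2.
\]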


Thus, the Weyl integral formula (Theorem 11.30) takes the claimed form.

It remains only to verify the correct normalization constant in the Weyl integral
formula. To see that (12.16) is properly normalized, it suffices to consider $f \equiv 1$,
in which case (12.16) says that

\[
1 = \frac{1}{|W|} \int_T |Q(t)|^2 \, dt, \tag{12.17}
\]

where $dt$ is the normalized volume form on $T$. Now, by Proposition 8.38, $\delta$ belongs
to the open fundamental Weyl chamber, which means (Proposition 8.27) that the
exponentials $e^{i\langle w \cdot \delta, H \rangle}$, $w \in W$, are all distinct. Proposition 12.10 then tells us that
these exponentials are orthonormal, so that the integral of $|Q(t)|^2$ over $T$ equals
$|W|$, verifying (12.17). □
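For $K = SU(2)$, the formula (12.16) applied to class functions can be tested numerically. In the sketch below we use the parametrization of Example 12.23, so that $|Q|^2 = |e^{i\theta} - e^{-i\theta}|^2 = 4\sin^2\theta$, $|W| = 2$, and $dt = d\theta/2\pi$; the function names and the midpoint-rule discretization are our own choices, not part of the text. The computation checks the normalization (12.17) and the orthonormality of the characters $\chi_{\Pi_m}$.

```python
import math


def weyl_integral_su2(class_fn, samples=512):
    """(1/|W|) * integral over T of |Q(t)|^2 f(t) dt for SU(2).

    Assumes |W| = 2, |Q|^2 = 4 sin^2(theta), dt = dtheta / (2 pi), and uses a
    midpoint rule on [-pi, pi) so that sin(theta) is never exactly zero.
    """
    step = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        theta = -math.pi + (k + 0.5) * step
        total += 4 * math.sin(theta) ** 2 * class_fn(theta)
    return 0.5 * total * step / (2 * math.pi)


def chi(m):
    """Character of Pi_m restricted to T, as in Example 12.23 (real-valued)."""
    return lambda theta: math.sin((m + 1) * theta) / math.sin(theta)


# Normalization check (f = 1) and the Gram matrix of the first four characters.
volume = weyl_integral_su2(lambda theta: 1.0)
gram = [[weyl_integral_su2(lambda t, a=a, b=b: chi(a)(t) * chi(b)(t))
         for b in range(4)] for a in range(4)]
```

The value `volume` should be 1, and `gram` should be the identity matrix up to rounding, consistent with Theorem 12.15.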


We are now ready for our analytic proof of the Weyl character formula.


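Proof of Theorem 12.22. Since $\Pi$ is finite dimensional, it has only finitely many
weights. Thus, there is a maximal weight $\mu$ (i.e., one such that no other weight of
$\Pi$ is higher than $\mu$), with multiplicity $m_\mu \geq 1$. We then claim that in the product
$q(H)\chi_\Pi(e^H)$, the exponential $e^{i\langle \mu + \delta, H \rangle}$ occurs only once, by taking the term $e^{i\langle \delta, H \rangle}$
from $q$ and the term $m_\mu e^{i\langle \mu, H \rangle}$ from $\chi_\Pi$. To see this, suppose we take $e^{i\langle w \cdot \delta, H \rangle}$ from
$q$ and $m_\lambda e^{i\langle \lambda, H \rangle}$ from $\chi_\Pi$, and that

\[
\lambda + w \cdot \delta = \mu + \delta,
\]

so that

\[
\lambda = \mu + (\delta - w \cdot \delta).
\]

But by Proposition 8.42, $\delta \succeq w \cdot \delta$ and, thus, $\lambda \succeq \mu$. Since $\mu$ is maximal, we
conclude that $\lambda = \mu$, in which case we must also have $w = I$.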

We conclude, then, that $e^{i\langle \mu + \delta, H \rangle}$ occurs with multiplicity exactly $m_\mu$ in the
product. Now, $\chi_\Pi(e^H)$ is a Weyl-invariant function, while $q(H)$ is Weyl-alternating. The
product of these two functions is then Weyl-alternating. Thus, by Corollary 10.17,
each of the exponentials $e^{i\langle w \cdot (\mu + \delta), H \rangle}$ occurs with multiplicity $\det(w) m_\mu$.

We now claim that $m_\mu = 1$ and that the only exponentials in the product $q\chi_\Pi$ are
those of the form $\det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}$. To see this, recall from Theorem 12.15 that

\[
\int_K |\chi_\Pi(x)|^2 \, dx = 1.
\]

Thus, by Proposition 12.24 (as applied to a class function), we have

\[
\int_T |Q(t)\chi_\Pi(t)|^2 \, dt = |W|, \tag{12.18}
\]

where $|W|$ is the order of the Weyl group.

Now, the product $Q\chi_\Pi$ is a sum of exponentials and those exponentials are
orthonormal on $T$ (Proposition 12.10). Since at least $|W|$ exponentials do occur
in the product, namely those of the form $\det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}$, each with coefficient
$\pm m_\mu$, these exponentials make a contribution of $m_\mu^2 |W|$ to the integral in (12.18).
By orthonormality, any remaining exponentials would make a positive contribution to the
integral. Thus, the only way (12.18) can hold is if $m_\mu = 1$ and there are no exponentials
in $Q\chi_\Pi$ other than those of the form $\det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}$. Thus, after dividing by $Q(t)$, we
obtain the Weyl character formula.
Since $\mu$ is a weight of $\Pi$, it must be analytically integral. It remains to show
that $\mu$ is dominant and that $\mu$ is the highest weight of $\Pi$. If $\mu$ were not dominant,
there would be some $w \in W$ for which $w \cdot \mu$ is dominant, in which case (by
Proposition 8.42), $\mu = w^{-1} \cdot (w \cdot \mu)$ would be lower than $w \cdot \mu$, contradicting
the maximality of $\mu$. Meanwhile, if $\mu$ were not the highest weight of $\Pi$, we could
choose a maximal element $\mu'$ among the weights of $\Pi$ that are not lower than $\mu$, and
this $\mu'$ would be another maximal weight for $\Pi$. As we have just shown, $\mu'$ would
also have to be dominant, in which case (Proposition 8.29), the Weyl-orbit of $\mu'$
would be disjoint from the Weyl-orbit of $\mu$. But then by the reasoning above, both
the exponentials $e^{i\langle w \cdot (\mu + \delta), H \rangle}$ and the exponentials $e^{i\langle w \cdot (\mu' + \delta), H \rangle}$ would occur with
nonzero multiplicity in the product $Q\chi_\Pi$, which would force the integral in (12.18)
to be larger than $|W|$. □
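The cancellation at the heart of this proof is easy to watch in the case $K = SU(2)$. In the sketch below, a sum of exponentials $\sum_k c_k e^{ik\theta}$ is stored as a dictionary `{k: c_k}`; this encoding, the helper names, and the assumption that $\Pi_m$ has weights $m, m-2, \ldots, -m$ with multiplicity one are ours. The Weyl denominator is $e^{i\theta} - e^{-i\theta}$, as in Example 12.23.

```python
from collections import Counter


def multiply(p, q):
    """Multiply two finite sums of exponentials stored as {exponent: coefficient}."""
    out = Counter()
    for a, ca in p.items():
        for b, cb in q.items():
            out[a + b] += ca * cb
    # Drop exponents whose coefficients cancelled.
    return {e: c for e, c in out.items() if c != 0}


def su2_product(m):
    """Product of the Weyl denominator with the character of Pi_m for SU(2)."""
    denominator = {1: 1, -1: -1}                      # e^{i theta} - e^{-i theta}
    character = {m - 2 * k: 1 for k in range(m + 1)}  # assumed weights of Pi_m
    return multiply(denominator, character)


products = {m: su2_product(m) for m in range(5)}
```

For each `m`, only the exponents $\pm(m+1)$ survive, with coefficients $\pm 1$, so the sum of the squared coefficients is $2 = |W|$, as (12.18) requires.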


In Chapter 10, we used the Casimir element and the character formula for
Verma modules to show that any exponential $e^{i\langle \lambda, H \rangle}$ that appears in the product
$q(H)\chi_\Pi(e^H)$ must satisfy $|\lambda| = |\mu + \delta|$. This was the key step in proving that
only exponentials with $\lambda = w \cdot (\mu + \delta)$ appear in $q\chi_\Pi$. In this chapter,
we have instead used the Weyl integral formula to show that if any unwanted
exponentials appeared, the integral of $|\chi_\Pi|^2$ over $K$ would be greater than 1, which
would contradict Theorem 12.15.


Proposition 12.25. Two irreducible representations of $K$ with the same highest
weight are isomorphic.


Proof. If $\Pi$ and $\Sigma$ both have highest weight $\mu$, then the Weyl character formula
shows that the characters $\chi_\Pi$ and $\chi_\Sigma$ must be equal. But if $\Pi$ and $\Sigma$ were not
isomorphic, Theorem 12.15 would imply that $\chi_\Pi$ and $\chi_\Sigma = \chi_\Pi$ are orthogonal. Thus,
$\chi_\Pi$ would have to be identically zero, which would contradict the normalization
result in Theorem 12.15. □


12.5 Constructing the Representations 


In our proof of Theorem 12.6, we showed as part of the Weyl character formula 
that every irreducible representation of K has a highest weight, and that this 
highest weight is dominant and analytically integral. We also showed in Proposi- 
tion 12.25 that two irreducible representations of K with the same highest weight 
are isomorphic. Thus, in proving Theorem 12.6, it remains only to prove that 
every dominant, analytically integral element arises as the highest weight of a 
representation. We continue to assume that $\delta$ (half the sum of the real, positive roots)
is an analytically integral element. This assumption is lifted in Sect. 12.6.

Suppose now that $\mu$ is a dominant, analytically integral element. We do not
yet know that $\mu$ is the highest weight of an irreducible representation of $K$.
Nevertheless, as we now demonstrate, there is a well-defined function $\Phi_\mu$ on $K$
whose restriction to $T$ is given by the right-hand side of the Weyl character formula.
Lemma 12.26. For each dominant, analytically integral element $\mu$, there is a
unique continuous function $\phi_\mu : T \to \mathbb{C}$ satisfying

\[
\phi_\mu(e^H) = \frac{\sum_{w \in W} \det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}}{q(H)}, \quad H \in \mathfrak{t}, \tag{12.19}
\]

whenever $q(H) \neq 0$.


In preparation for the proof of this lemma, recall from Lemma 10.28 (adjusted
for using real roots) that the Weyl denominator can be written as

\[
q(H) := \sum_{w \in W} \det(w) e^{i\langle w \cdot \delta, H \rangle} = (2i)^k \prod_{\alpha \in R^+} \sin(\langle \alpha, H \rangle / 2),
\]

where $k$ is the number of positive roots. For each $\alpha \in R$ and each integer $n$, define
a hyperplane (not necessarily through the origin) by

\[
V_{\alpha,n} = \{ H \in \mathfrak{t} \mid \langle \alpha, H \rangle = 2\pi n \}. \tag{12.20}
\]

Then the zero-set of the Weyl denominator $q$ is the union of all the $V_{\alpha,n}$'s, as $\alpha$ ranges
over the set of positive real roots and $n$ ranges over the integers. Furthermore, $q$ has
a "simple zero" on each of these hyperplanes. Symmetry properties of the numerator
on the right-hand side of (12.19) will then force this function to be zero on each $V_{\alpha,n}$,
so that the zeros in the numerator cancel the zeros in the denominator.
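The product formula for the Weyl denominator can be spot-checked numerically in the case of the root system $A_2$. The setup below is an assumption of ours, made only for illustration: $W = S_3$ acts on $\mathbb{R}^3$ by permuting coordinates, the positive roots are $e_j - e_k$ with $j < k$ (so $k = 3$ positive roots), $\delta = (1, 0, -1)$, and the pairing is the standard dot product. All function names are hypothetical.

```python
import cmath
import math
from itertools import permutations


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def q_sum(H, delta=(1, 0, -1)):
    """Alternating sum over W = S_3 of exp(i <w.delta, H>)."""
    total = 0
    for perm in permutations(range(3)):
        w_delta = [delta[perm[j]] for j in range(3)]
        pairing = sum(w * h for w, h in zip(w_delta, H))
        total += sign(perm) * cmath.exp(1j * pairing)
    return total


def q_product(H):
    """(2i)^3 times the product of sin(<alpha, H>/2) over the positive roots."""
    value = (2j) ** 3
    for j in range(3):
        for k in range(j + 1, 3):
            value *= math.sin((H[j] - H[k]) / 2)
    return value


H_sample = (0.4, -1.3, 2.2)
gap = abs(q_sum(H_sample) - q_product(H_sample))
```

Both expressions also vanish when two coordinates of `H` differ by a multiple of $2\pi$, which is the statement that the zero-set of $q$ is the union of the hyperplanes $V_{\alpha,n}$.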


Proof. By Assumption 12.21, the element $\delta$ is analytically integral, which means
that each of $w \cdot (\mu + \delta)$ and $w \cdot \delta$, $w \in W$, is also analytically integral. Thus, by
Proposition 12.9, all of the exponentials involved are well-defined functions on $T$.
It follows that the right-hand side of (12.19) is a well-defined function on $T$, outside
of the set where the denominator is zero. We now argue that the right-hand side
of (12.19) extends uniquely to a continuous function on $T$.

Let $\psi_\mu(H)$, $H \in \mathfrak{t}$, denote the numerator on the right-hand side of (12.19):

\[
\psi_\mu(H) = \sum_{w \in W} \det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle}.
\]

We claim that $\psi_\mu$ vanishes on each of the hyperplanes $V_{\alpha,n}$ in (12.20). To verify this
claim, note that each weight $\lambda = w \cdot (\mu + \delta)$ occurring in $\psi_\mu$ is analytically integral,
and thus also algebraically integral, by Proposition 12.7. Thus, $\langle \lambda, H_\alpha \rangle$ is an integer
for each $\alpha \in R$. It follows that each exponential in $\psi_\mu$ (and, thus, $\psi_\mu$ itself) is
invariant under translation by $2\pi H_\alpha$:

\[
\psi_\mu(H + 2\pi H_\alpha) = \psi_\mu(H)
\]

for all $H \in \mathfrak{t}$ and all $\alpha \in R$.
Meanwhile, the coefficients of $\det(w)$ in the formula for $\psi_\mu$ guarantee that $\psi_\mu$ is
alternating with respect to the action of $W$. Finally, if $H \in V_{\alpha,n}$, we have

\[
s_\alpha \cdot H = H - 2\frac{\langle \alpha, H \rangle}{\langle \alpha, \alpha \rangle}\alpha = H - 2\pi n H_\alpha.
\]

Putting these observations together, we have, for $H \in V_{\alpha,n}$,

\[
\psi_\mu(H) = -\psi_\mu(s_\alpha \cdot H) = -\psi_\mu(H - 2\pi n H_\alpha) = -\psi_\mu(H),
\]

which can happen only if $\psi_\mu(H) = 0$. (See Figure 12.4, where the gray lines are
the hyperplanes $V_{\alpha,n}$ with $|n| < 2$. Compare also Exercise 6.)


Now, for each $\alpha$ and $n$, let $L_{\alpha,n} : \mathfrak{t} \to \mathbb{R}$ be given by

\[
L_{\alpha,n}(H) = \langle \alpha, H \rangle - 2\pi n,
\]

so that the zero-set of $L_{\alpha,n}$ is precisely $V_{\alpha,n}$. Then the function

\[
\frac{L_{\alpha,n}(H)}{\sin(\langle \alpha, H \rangle / 2)} \tag{12.21}
\]

extends to a continuous function on a neighborhood of $V_{\alpha,n}$ in $\mathfrak{t}$. On the other hand,
we have shown that $\psi_\mu$ vanishes on $V_{\alpha,n}$ for each $\alpha$ and $n$. Furthermore, since $\psi_\mu$ is
given by a globally convergent power series, it is not hard to prove that $\psi_\mu$ can be
expressed as

\[
\psi_\mu(H) = f(H) L_{\alpha,n}(H)
\]

for some smooth function $f$, where $f$ is also given by a globally convergent power
series and where $f$ still vanishes on each hyperplane $V_{\beta,m}$ with $(\beta, m) \neq (\alpha, n)$.
(See Exercise 7.)

Fig. 12.4 The function $\psi_\mu$ changes sign under the reflection $s_\alpha$ and is unchanged under translation
by $2\pi H_\alpha$, forcing $\psi_\mu$ to be zero on each $V_{\alpha,n}$

Now let $H$ be an arbitrary point in $\mathfrak{t}$. Since the hyperplanes $V_{\alpha,n}$ are disjoint for $\alpha$
fixed and $n$ varying, $H$ is contained in only finitely many of these hyperplanes, say
$V_{\alpha_1,n_1}, \ldots, V_{\alpha_m,n_m}$. Since $\psi_\mu$ vanishes on each of these hyperplanes, we can show
inductively that $\psi_\mu$ can be expressed as

\[
\psi_\mu = L_{\alpha_1,n_1} \cdots L_{\alpha_m,n_m} \, g \tag{12.22}
\]

for some smooth function $g$. Since the function in (12.21) is nonsingular in a
neighborhood of $V_{\alpha,n}$, we conclude that the factors of $L_{\alpha_j,n_j}$ in (12.22) cancel all
the zeros in $q(H)$, showing that $\phi_\mu(e^H) = \psi_\mu(H)/q(H)$ extends to a continuous
function in a neighborhood of $H$.

Finally, in local exponential coordinates on $T$, each point $H$ is contained in at
most finitely many of the hyperplanes on which $q$ vanishes. Thus, $H$ is a limit of
points on which $q \neq 0$, showing the uniqueness of the continuous extension. □
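For $K = SU(2)$, in the parametrization of Example 12.23, the ratio in (12.19) is $\sin((m+1)\theta)/\sin\theta$ and the hyperplanes $V_{\alpha,n}$ are the points $\theta = \pi n$. The sketch below, whose function names are our own, compares values of the ratio very close to $\theta = \pi n$ with the limiting value obtained from L'Hopital's rule, illustrating that the zeros of the numerator cancel the simple zeros of the denominator.

```python
import math


def phi(m, theta):
    """Ratio psi/q for SU(2): sin((m+1)theta)/sin(theta), defined where sin(theta) != 0."""
    return math.sin((m + 1) * theta) / math.sin(theta)


def limit_at(m, n):
    """Value of the continuous extension at theta = pi*n, by L'Hopital's rule."""
    return (m + 1) * math.cos((m + 1) * math.pi * n) / math.cos(math.pi * n)


# Evaluate just off each zero of the denominator and compare with the limit.
gaps = [abs(phi(m, math.pi * n + 1e-6) - limit_at(m, n))
        for m in range(5) for n in (-1, 0, 1, 2)]
```

At $\theta = 0$ the extended function takes the value $m + 1$, which is the dimension of $\Pi_m$, as it must be for the value of a character at the identity.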


Lemma 12.27. Let $\phi_\mu$ be as in Lemma 12.26 and let $\Phi_\mu : K \to \mathbb{C}$ be the unique
continuous class function on $K$ such that $\Phi_\mu|_T = \phi_\mu$ (Corollary 11.40). Then as $\mu$
ranges over the set of dominant, analytically integral elements, the functions $\Phi_\mu$
form an orthonormal set:

\[
\int_K \overline{\Phi_\mu(x)} \Phi_{\mu'}(x) \, dx = \delta_{\mu,\mu'}.
\]

Note that we do not know, at the moment, that each $\Phi_\mu$ is actually the character
of a representation of $K$. Thus, we cannot appeal to Theorem 12.15.


Proof. Since the denominator in the definition of $\phi_\mu$ is the Weyl denominator,
Corollary 11.32 to the Weyl integral formula tells us that

\[
\begin{aligned}
\int_K \overline{\Phi_\mu(x)} \Phi_{\mu'}(x) \, dx
&= \frac{1}{|W|} \int_T |Q(t)|^2 \, \overline{\phi_\mu(t)} \phi_{\mu'}(t) \, dt \\
&= \frac{1}{|W|} \int_T \overline{\Big( \sum_{w \in W} \det(w) e^{i\langle w \cdot (\mu + \delta), H \rangle} \Big)}
\Big( \sum_{w \in W} \det(w) e^{i\langle w \cdot (\mu' + \delta), H \rangle} \Big) \, dH.
\end{aligned} \tag{12.23}
\]

Now, since $\mu + \delta$ and $\mu' + \delta$ are strictly dominant, $W$ acts freely on these elements.
If $\mu \neq \mu'$, the $W$-orbit of $\mu + \delta$ will be disjoint from the $W$-orbit of $\mu' + \delta$. Thus,
the exponentials occurring in the numerator of $\phi_\mu$ will be disjoint from those in the
numerator of $\phi_{\mu'}$, which means, by Proposition 12.10, that (12.23) is zero. If, on the
other hand, $\mu = \mu'$, we have the norm-squared of $|W|$ distinct, orthonormal exponentials,
so that the right-hand side of (12.23) reduces to 1. □


We are now ready to complete the proof of Theorem 12.6, in the case in which the
element $\delta$ is analytically integral. It remains only to prove that for each dominant,
analytically integral element $\mu$, there is an irreducible representation of $K$ with
highest weight $\mu$.


Proof of Theorem 12.6. Here is what we know so far about the characters of
irreducible representations. First, by Theorem 12.22, the character of each irreducible
representation $\Pi$ must be equal to $\Phi_\mu$, where $\mu$ is the highest weight of $\Pi$. Second,
by Lemma 12.27, all the $\Phi_\mu$'s, with $\mu$ dominant and analytically integral (whether
or not they are characters of a representation), form an orthonormal set. Last, by
Theorem 12.18, the characters of the irreducible representations form a complete
orthonormal set.

Let $\mu_0$ be a fixed dominant, analytically integral element and suppose, toward
a contradiction, that there did not exist an irreducible representation with highest
weight $\mu_0$. Then the function $\Phi_{\mu_0}$ would be a continuous class function on $K$ and
$\Phi_{\mu_0}$ would be orthogonal to the character of every irreducible representation. (After
all, every irreducible character is of the form $\Phi_\mu$, where $\mu$ is the highest weight
of the representation, and we are assuming that $\mu_0$ is not the highest weight of
any representation.) But Theorem 12.18 says that a continuous class function that
is orthogonal to every irreducible character must be identically zero. Thus, $\Phi_{\mu_0}$
would have to be the zero function, which is impossible, since $|\Phi_{\mu_0}|^2$ integrates to
1 over $K$. □


We may put the argument in a different way as follows. Theorem 12.18 says 
that the characters of irreducible representations form a complete orthonormal set 
inside the space of continuous class functions on K. The Weyl character formula, 
meanwhile, says that the characters form a subset of the set of $\Phi_\mu$'s. Finally, the
Weyl integral formula says that the collection of all $\Phi_\mu$'s, with $\mu$ dominant and
analytically integral, is orthonormal. If there were some such $\mu$ for which $\Phi_\mu$ was
not a character, then the set of characters would be a proper subset of an orthonormal
set, in which case, the characters could not be complete.

Now, the preceding proof of the theorem of the highest weight is not very 
constructive, in contrast to the Lie-algebraic proof, in which we gave a direct 
construction of each finite-dimensional representation as a quotient of a Verma 
module. If one looks carefully at the proof of Theorem 12.18, however, one sees 
a hint of a more direct description of the representations from the compact group 
perspective. Specifically, let $C(K)$ denote the space of continuous functions on $K$,
and define a left and a right action of $K$ on $C(K)$ as follows. For each $x \in K$, define
$L_x$ and $R_x$ as operators on $C(K)$ by

\[
(L_x f)(y) = f(x^{-1}y), \qquad (R_x f)(y) = f(yx).
\]

One may easily check that $L_{xy} = L_x L_y$ and $R_{xy} = R_x R_y$. Thus, both $L$ and $R$
define representations of $K$ acting on the infinite-dimensional space $C(K)$.

We now show that each irreducible representation of $K$ occurs as a
finite-dimensional subspace of $C(K)$ that is invariant under the right action of $K$. For each
irreducible representation $(\Pi, V)$ of $K$, we can fix some nonzero vector $v_0 \in V$,
which might, for example, be a highest weight vector. Then for each $v \in V$, we can
consider the function $f_v : K \to \mathbb{C}$ given by

\[
f_v(x) = \langle v_0, \Pi(x)v \rangle. \tag{12.24}
\]

(The function $f_v$ is a special sort of matrix entry for $\Pi$, in the sense of the proof
of Theorem 12.18.) It is not hard to check that the map $v \mapsto f_v$ is injective; see
Exercise 8.

Let $W$ denote the space of all functions of the form $f_v$, with $v \in V$. We can
compute that

\[
(R_x f_v)(y) = \langle v_0, \Pi(yx)v \rangle = \langle v_0, \Pi(y)(\Pi(x)v) \rangle,
\]

which means that

\[
R_x f_v = f_{\Pi(x)v}.
\]

Thus, $W$ is a finite-dimensional invariant subspace for the right action of $K$ on
$C(K)$. Indeed, the map $v \mapsto f_v$ is a bijective intertwining map between $V$ and $W$.


Conclusion 12.28. For each irreducible representation $(\Pi, V)$ of $K$, fix a nonzero
vector $v_0 \in V$ and let $W$ denote the subspace of $C(K)$ consisting of functions of the
form (12.24), with $v \in V$. Then $W$ is invariant under the right action of $K$ and is
isomorphic, as a representation of $K$, to $V$.
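The intertwining identity $R_x f_v = f_{\Pi(x)v}$ behind this conclusion can be tested on a concrete example. The sketch below takes $K = SU(2)$ acting on $\mathbb{C}^2$ by matrix multiplication (the standard representation), with the inner product linear in the second factor; the parametrization in `su2`, the choice $v_0 = (1, 0)$, and all helper names are our own assumptions.

```python
import cmath
import math


def su2(alpha, beta, gamma):
    """An element [[a, -conj(b)], [b, conj(a)]] of SU(2) with |a|^2 + |b|^2 = 1."""
    a = cmath.exp(1j * alpha) * math.cos(gamma)
    b = cmath.exp(1j * beta) * math.sin(gamma)
    return [[a, -b.conjugate()], [b, a.conjugate()]]


def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(X, v):
    """Action of a 2x2 matrix on a vector in C^2."""
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]


def inner(u, v):
    """Inner product on C^2, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


v0 = [1 + 0j, 0j]  # fixed nonzero vector (an arbitrary choice)


def f(v):
    """The function f_v(x) = <v0, Pi(x) v> of (12.24) for the standard representation."""
    return lambda x: inner(v0, apply(x, v))


x = su2(0.3, 1.2, 0.7)
y = su2(-0.5, 0.4, 1.1)
v = [0.2 + 1j, -0.7 + 0.1j]

lhs = f(v)(matmul(y, x))   # (R_x f_v)(y) = f_v(yx)
rhs = f(apply(x, v))(y)    # f_{Pi(x) v}(y)
```

The two values agree because $\Pi(yx)v = \Pi(y)(\Pi(x)v)$, which is exactly the computation displayed before the conclusion.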


One can pursue this line of analysis further by choosing $v_0$ to be a highest
weight vector and then attempting to describe the space $W$ (without referring to
the representation $\Pi$) by means of its behavior under certain differential operators
on $K$. See Sect. 4.12 of [DK] for more information.


12.6 The Case in Which $\delta$ is Not Analytically Integral


For a general connected compact group $K$, the element $\delta$ may not be analytically
integral. (See Sect. 12.2.) If $\delta$ is not analytically integral, many of the functions
we have been working with will not be well-defined functions on $T$. Specifically,
exponentials of the form $e^{i\langle w \cdot \delta, H \rangle}$ and $e^{i\langle w \cdot (\mu + \delta), H \rangle}$, with $w \in W$ and $\mu$ a dominant,
analytically integral element, no longer define functions on $T$. Fortunately, all of our
calculations involve products of such exponentials, and these products turn out to be
well-defined functions on $T$. Thus, all the arguments in the previous three sections
will go through with minor modifications.

In all cases, the quantity $2\delta = \sum_{\alpha \in R^+} \alpha$ is analytically integral, by Point 3 of
Proposition 12.7. Thus, for each $H \in \mathfrak{t}$ for which $e^{2\pi H} = I$, the quantity $\langle \delta, H \rangle$
must either be an integer or a half integer. If $\delta$ is not analytically integral, there must
exist some $H \in \mathfrak{t}$ with $e^{2\pi H} = I$ for which $\langle \delta, H \rangle$ is a half integer (but not an
integer). With this observation in mind, we make the following definition.


Definition 12.29. If $\delta$ is not analytically integral, we say that $\lambda \in \mathfrak{t}$ is half integral
if $\lambda - \delta$ is analytically integral.


That is to say, the half integral elements are those of the form $\lambda = \delta + \lambda'$, with
$\lambda'$ being analytically integral.


Proposition 12.30. If $\lambda$ and $\eta$ are half integral, then $\lambda + \eta$ is analytically integral.
If $\lambda$ is half integral, then $-\lambda$ is also half integral and $w \cdot \lambda$ is half integral for all
$w \in W$.


Proof. If $\lambda = \delta + \lambda'$ and $\eta = \delta + \eta'$ are half integral, then $\lambda + \eta = 2\delta + \lambda' + \eta'$ is
analytically integral. If $\lambda = \delta + \lambda'$ is half integral, so is

\[
-\lambda = -\delta - \lambda' = \delta - 2\delta - \lambda'.
\]

For each $w \in W$, the set $w \cdot R^+$ will consist of a certain subset $S$ of the positive
roots, together with the negatives of the roots in $R^+ \setminus S$. Thus, $w \cdot \delta$ will consist of
half the sum of the elements of $S$ minus half the sum of the elements of $R^+ \setminus S$. It
follows that

\[
\delta - w \cdot \delta = \sum_{\alpha \in R^+ \setminus S} \alpha,
\]

showing that $w \cdot \delta$ is again half integral. (Recall that each root is analytically integral,
by Proposition 12.7.) More generally, if $\lambda = \delta + \lambda'$ is half integral, so is $w \cdot \lambda =
w \cdot \delta + w \cdot \lambda'$. □


Note that exponentials of the form $e^{i\langle \lambda, H \rangle}$, with $\lambda$ being half integral, do not
descend to functions on $T$. Our next result says that, nevertheless, the product
of two such exponentials (one of them conjugated) does descend to $T$. Furthermore,
such exponentials are still "orthonormal on $T$," as in Proposition 12.10 in the
integral case.


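Proposition 12.31. If $\lambda$ and $\eta$ are half integral, there is a well-defined function $f$
on $T$ such that

\[
f(e^H) = \overline{e^{i\langle \lambda, H \rangle}} \, e^{i\langle \eta, H \rangle}
\]

and

\[
\int_T \overline{e^{i\langle \lambda, H \rangle}} \, e^{i\langle \eta, H \rangle} \, dH = \delta_{\lambda,\eta}.
\]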


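Proof. If $\lambda$ and $\eta$ are half integral, then $-\lambda$ is half integral, so that $\eta - \lambda$ is
analytically integral. Thus, by Proposition 12.9, there is a well-defined function
$f : T \to \mathbb{C}$ satisfying

\[
f(e^H) = \overline{e^{i\langle \lambda, H \rangle}} \, e^{i\langle \eta, H \rangle} = e^{i\langle \eta - \lambda, H \rangle}.
\]

Furthermore, if $\lambda = \delta + \lambda'$ and $\eta = \delta + \eta'$ are half integral, we have

\[
\begin{aligned}
\int_T \overline{e^{i\langle \lambda, H \rangle}} \, e^{i\langle \eta, H \rangle} \, dH
&= \int_T e^{-i\langle \delta + \lambda', H \rangle} e^{i\langle \delta + \eta', H \rangle} \, dH \\
&= \int_T \overline{e^{i\langle \lambda', H \rangle}} \, e^{i\langle \eta', H \rangle} \, dH \\
&= \delta_{\lambda',\eta'},
\end{aligned}
\]

by Proposition 12.10. Since $\lambda = \eta$ if and only if $\lambda' = \eta'$, we have the desired
"orthonormality" result. □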
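A simple setting in which to see this behavior is $K = SO(3)$, with $T$ the rotations about a fixed axis, parametrized by the rotation angle $\theta \in [0, 2\pi)$. The following sketch rests on assumptions of ours that are not spelled out in this section: that in this parametrization the positive root $\alpha$ corresponds to the function $e^{i\theta}$, so that $\delta = \alpha/2$ is not analytically integral and the half integral elements are $\lambda = (n + \tfrac{1}{2})\alpha$ with $n$ an integer. The function names are hypothetical.

```python
import cmath
import math


def half_integral_exp(n, theta):
    """exp(i (n + 1/2) theta), the exponential attached to lambda = (n + 1/2) alpha."""
    return cmath.exp(1j * (n + 0.5) * theta)


def torus_pairing(a, b, samples=128):
    """Normalized integral over T of conj(e_a) * e_b, via a uniform Riemann sum."""
    step = 2 * math.pi / samples
    total = sum(half_integral_exp(a, k * step).conjugate()
                * half_integral_exp(b, k * step) for k in range(samples))
    return total / samples


# A single half integral exponential changes sign when theta increases by 2 pi,
# so it is not a function on T.
theta = 0.9
sign_flip = half_integral_exp(0, theta + 2 * math.pi) / half_integral_exp(0, theta)

# Products of the form conj(e_a) * e_b are 2 pi-periodic and orthonormal.
same = torus_pairing(1, 1)
different = torus_pairing(0, 2)
```

Here `sign_flip` is $-1$ up to rounding, while `same` and `different` are close to 1 and 0, matching the statement of Proposition 12.31 in this example.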


We now discuss how the results of Sects. 11.6, 12.4, and 12.5 should be modified
when $\delta$ is not analytically integral. In the case of the Weyl integral formula
(Theorem 11.30), the function $q(H)$, $H \in \mathfrak{t}$, does not descend to $T$ when $\delta$ is not integral.
That is to say, there is no function $Q(t)$ on $T$ such that $Q(e^H) = q(H)$.
Nevertheless, the function $|q(H)|^2$ does descend to $T$, since $|q(H)|^2$ is a sum of
products of half integral exponentials. The Weyl integral formula, with the same
proof, then holds even if $\delta$ is not analytically integral, provided that the expression
$|Q(t)|^2$ is interpreted as the function $e^H \mapsto |q(H)|^2$. We may then consider the
case $f \equiv 1$ and use Proposition 12.31 to verify the correctness of the normalization
in the Weyl integral formula.

In the case of the Weyl character formula, we claim that the right-hand side
of (12.14) descends to a function on $T$. To see this, note that we can pull a factor
of $e^{i\langle \delta, H \rangle}$ out of each exponential in the numerator and each exponential in the
denominator. After canceling these factors, we are left with exponentials in both
the numerator and denominator that descend to $T$. Meanwhile, in the proof of
the character formula, although the function $Q(t)\chi_\Pi(t)$ is not well defined on $T$,
the function $|Q(t)\chi_\Pi(t)|^2$ is well defined. The Weyl integral formula (interpreted
as in the previous paragraph) tells us that the integral of $|Q(t)\chi_\Pi(t)|^2$ over $T$ is
equal to $|W|$, as in the case where $\delta$ is analytically integral. If we then apply the
orthonormality result in Proposition 12.31, we see that, just as in the integral case,
the only exponentials present in the product $q(H)\chi_\Pi(e^H)$ are those in the numerator
of the character formula.
Finally, we consider the proof that every dominant, analytically integral element
is the highest weight of a representation. If $\delta$ is not analytically integral, then neither
the numerator nor the denominator on the right-hand side of (12.19) descends to a
function on $T$. Nevertheless, the ratio of these functions does descend to $T$, by the
argument in the preceding paragraph. The argument that $\phi_\mu$ extends to a continuous
function on $T$ then goes through without change. (This argument requires only that
each weight $\lambda = w \cdot (\mu + \delta)$ in $\psi_\mu$ be algebraically integral, which holds even
if $\delta$ is not analytically integral, by Proposition 8.38.) Thus, we may apply the
half-integral version of the Weyl integral formula to show that the functions $\Phi_\mu$ on $K$ are
orthonormal, as $\mu$ ranges over the set of dominant, analytically integral elements.
The rest of the argument then proceeds without change.


12.7 Exercises 


1. Let K = SU(2) and let $\mathfrak{t}$ be the diagonal subalgebra of $\mathfrak{su}(2)$. Prove directly that 
every algebraically integral element is analytically integral. 

Note: Since SU(2) is simply connected, this claim also follows from the general 
result in Corollary 13.20. 

2. This exercise asks you to use the theory of Fourier series to give a direct proof of 
the completeness result for characters (Theorem 12.18), in the case K = SU(2). 
To this end, suppose f is a continuous class function on SU(2) that is 
orthogonal to the character of every representation. 


(a) Using the explicit form of the Weyl integral formula for SU(2) (Example 11.33) 
and the explicit form of the characters for SU(2) (Example 12.23), show that 

$$\int_{-\pi}^{\pi} f(\operatorname{diag}(e^{i\theta}, e^{-i\theta}))\,(\sin\theta)\,\sin((m+1)\theta)\,d\theta = 0$$

for every non-negative integer m. 

(b) Show that the function $\theta \mapsto f(\operatorname{diag}(e^{i\theta}, e^{-i\theta}))(\sin\theta)$ is an odd function of 
$\theta$. 

(c) Using standard results from the theory of Fourier series, conclude that f 
must be identically zero. 


3. Suppose $(\Pi, V)$ and $(\Sigma, W)$ are representations of a group G, and let 
Hom(V, W) denote the space of all linear maps from V to W. Let G act on 
Hom(V, W) by 

$$g \cdot A = \Sigma(g)\, A\, \Pi(g)^{-1}, \tag{12.25}$$

for all $g \in G$ and $A \in \operatorname{Hom}(V, W)$. Show that A is an intertwining map of V to 
W if and only if $g \cdot A = A$ for all $g \in G$. 

4. If V and W are finite-dimensional vector spaces, let $\Phi : V^* \otimes W \to \operatorname{Hom}(V, W)$ 
be the unique linear map such that for all $\xi \in V^*$ and $w \in W$, we have 

$$\Phi(\xi \otimes w)(v) = \xi(v)\,w.$$

(a) Show that $\Phi$ is an isomorphism. 

(b) Let $(\Pi, V)$ and $(\Sigma, W)$ be representations of a group G, let G act on $V^*$ as 
in Sect. 4.3.3, and let G act on Hom(V, W) as in (12.25). Show that the map 
$\Phi : V^* \otimes W \to \operatorname{Hom}(V, W)$ in Part (a) is an intertwining map. 


5. Suppose $f(x) := \operatorname{trace}(\Pi(x)A)$ and $g(x) := \operatorname{trace}(\Sigma(x)B)$ are matrix entries 
for nonisomorphic, irreducible representations $\Pi$ and $\Sigma$ of K (Definition 12.19). 
Show that f and g are orthogonal: 

$$\int_K \overline{f(x)}\, g(x)\, dx = 0.$$

Hint: Imitate the proof of Theorem 12.15. 

6. Let $\{V_{\alpha,n}\}$ denote the collection of hyperplanes in (12.20). If $H_\alpha = 2\alpha/\langle\alpha,\alpha\rangle$ is 
the real coroot associated to a real root $\alpha$, show that 

$$V_{\alpha,n} + m\pi H_\alpha = V_{\alpha,n+m}.$$


7. Let V be a hyperplane in $\mathbb{R}^n$, not necessarily through the origin, and let $L : \mathbb{R}^n \to \mathbb{R}$ 
be an affine function whose zero-set is precisely V. Suppose $g : \mathbb{R}^n \to \mathbb{C}$ is 
given by a globally convergent power series in n variables and that g vanishes 
on V. 


(a) Show that g can be expressed as g = Lh for some function h, where h is 
also given by a globally convergent power series. 
Hint: Choose a coordinate system $y_1, \ldots, y_n$ on $\mathbb{R}^n$ with origin in V such 
that $L(y) = y_1$. 

(b) Suppose g also vanishes on some hyperplane $V'$ distinct from V. Show that 
the function h in Part (a) vanishes on $V'$. 


8. Let $(\Pi, V)$ be an irreducible representation of K and let $v_0$ be a nonzero element 
of V. For each $v \in V$, let $f_v$ be the function given in (12.24). Show that if $f_v$ is 
the zero function, then v = 0. 

9. Let K = U(n) and let T be the diagonal subgroup of K. Show that the density 
$\rho(\cdot)$ in the Weyl integral formula (Theorem 11.30) can be computed explicitly as 

$$\rho(e^{i\theta_1}, \ldots, e^{i\theta_n}) = \prod_{1 \le j < k \le n} \left| e^{i\theta_j} - e^{i\theta_k} \right|^2.$$

Hint: Use Proposition 12.24, as interpreted in Sect. 12.6 in the case where $\delta$ is 
not necessarily analytically integral. 


Chapter 13 
Fundamental Groups of Compact Lie Groups 


13.1 The Fundamental Group 


In this section, we briefly review the notion of the fundamental group of a 
topological space. For a more detailed treatment, the reader should consult any 
standard book on algebraic topology, such as [Hat, Chapter 1]. Let X be any 
path-connected Hausdorff topological space and let $x_0$ be a fixed point in X (the 
"basepoint"). We consider loops in X based at $x_0$ (i.e., continuous maps $l : [0,1] \to X$ 
with the property that $l(0) = l(1) = x_0$). The choice of the basepoint makes no 
substantive difference to the constructions that follow. From now on, "based loop" 
will mean "loop based at $x_0$." Ultimately, we are interested in the case that X is a 
matrix Lie group. 

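If $l_1$ and $l_2$ are two based loops, then we define the concatenation of $l_1$ and $l_2$ to 
be the loop $l_1 \cdot l_2$ given by 

$$l_1 \cdot l_2(t) = \begin{cases} l_1(2t), & 0 \le t \le \tfrac{1}{2}, \\ l_2(2t - 1), & \tfrac{1}{2} \le t \le 1; \end{cases}$$

that is, $l_1 \cdot l_2$ traverses $l_1$ as t goes from 0 to 1/2 and then traverses $l_2$ as t goes from 
1/2 to 1. 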
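
The concatenation formula translates directly into code. The sketch below is illustrative only: loops are modeled as Python functions on [0, 1], and the two sample loops in the unit circle (`once` and `twice`, both based at the point 1) are hypothetical examples, not objects from the text.

```python
import cmath
import math

def concatenate(l1, l2):
    """Return the loop l1 . l2: run l1 on [0, 1/2], then l2 on [1/2, 1]."""
    def loop(t):
        # At t = 1/2 both branches give the basepoint, so the choice is harmless.
        return l1(2 * t) if t <= 0.5 else l2(2 * t - 1)
    return loop

def once(t):
    """Hypothetical loop: one trip around the unit circle, based at 1."""
    return cmath.exp(2j * math.pi * t)

def twice(t):
    """Hypothetical loop: two trips around the unit circle, based at 1."""
    return cmath.exp(4j * math.pi * t)

combined = concatenate(once, twice)
```

Both halves are traversed at double speed, which is why the concatenated loop is again parametrized by [0, 1] and is again based at the same point.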

Two based loops $l_1$ and $l_2$ are said to be homotopic if one can be "continuously 
deformed" into the other. More precisely, this means that there exists a continuous 
map $A : [0,1] \times [0,1] \to X$ such that $A(0,t) = l_1(t)$ and $A(1,t) = l_2(t)$ for all 
$t \in [0,1]$ and such that $A(s,0) = A(s,1) = x_0$ for all $s \in [0,1]$. One should think 
of $A(s,t)$ as a family of loops parameterized by s. In some cases, we may use the 
notation $l_s(t)$ in place of $A(s,t)$ to emphasize this point of view. 

A based loop is said to be null homotopic if it is homotopic to the constant loop 
(i.e., the loop $l^0$ for which $l^0(t) = x_0$ for all $t \in [0,1]$). If all loops in X based at $x_0$ 
are null homotopic, then X is said to be simply connected. Since we are assuming 
X is path connected, it is not hard to show that if all loops at one basepoint are 
null homotopic, the same is true for every other basepoint. Furthermore, if a loop 
based at $x_0$ can be shrunk to a point without fixing the basepoint (i.e., requiring only 
that $A(s,0) = A(s,1)$), then it can also be shrunk to a point with basepoint fixed 
(i.e., requiring $A(s,0) = A(s,1) = x_0$). 

The notion of homotopy is an equivalence relation on loops based at $x_0$. 
The homotopy class of a loop $l$ is then the set of all loops that are homotopic 
to $l$, and each loop belongs to one and only one homotopy class. The concatenation 
operation "respects homotopy," meaning that if $l_1$ is homotopic to $l_2$ and $m_1$ is 
homotopic to $m_2$, then $l_1 \cdot m_1$ is homotopic to $l_2 \cdot m_2$. As a result, it makes sense to 
define the concatenation operation on equivalence classes. 

The operation of concatenation makes the set of homotopy classes of loops based 
at $x_0$ into a group, called the fundamental group of X and denoted $\pi_1(X)$. To verify 
associativity, we note that although $(l_1 \cdot l_2) \cdot l_3$ is not the same as $l_1 \cdot (l_2 \cdot l_3)$, the second 
of these two loops is a reparameterization of the first, from which it is not hard to see 
that the loops are homotopic. Meanwhile, the identity in $\pi_1(X)$ is the constant loop 
$l^0$. This is not an identity at the level of loops but is at the level of homotopy classes; 
that is, $l \cdot l^0$ and $l^0 \cdot l$ are not equal to $l$, but they are both homotopic to $l$, since both 
are reparameterizations of $l$. Finally, for inverses, the inverse to a homotopy class $[l]$ 
is the homotopy class $[l']$ where $l'(t) = l(1 - t)$. (It is not hard to see that both $l \cdot l'$ 
and $l' \cdot l$ are null homotopic.) A topological space X is simply connected precisely 
if its fundamental group is the trivial group. 

Some standard examples of fundamental groups are as follows: $\mathbb{R}^n$ is simply 
connected for all n, $S^n$ is simply connected for $n \ge 2$, and the fundamental group 
of $S^1$ is isomorphic to $\mathbb{Z}$. 


Definition 13.1. If X and Y are Hausdorff topological spaces, a continuous map 
$\pi : Y \to X$ is a covering map if (1) $\pi$ maps Y onto X and (2) for each $x \in X$, 
there is a neighborhood V of x such that $\pi^{-1}(V)$ is a disjoint union of open sets $U_\alpha$, 
where the restriction of $\pi$ to each $U_\alpha$ is a homeomorphism of $U_\alpha$ onto V. A cover 
of X is a pair $(Y, \pi)$, where $\pi : Y \to X$ is a covering map. If $(Y, \pi)$ is a cover of X 
and Y is simply connected, then $(Y, \pi)$ is a universal cover of X. 

If $(Y, \pi)$ is a cover of X and $f : Z \to X$ is a continuous map, then a map 
$\tilde{f} : Z \to Y$ is a lift of f if $\tilde{f}$ is continuous and $\pi \circ \tilde{f} = f$. 


It is known that every connected manifold (indeed, every reasonably nice 
connected topological space) has a universal cover, and that this universal cover 
is unique up to a “canonical” homeomorphism, that is, one that intertwines the 
covering maps. (See, for example, pp. 63—66 in [Hat].) Thus, we may speak about 
"the" universal cover of any connected manifold. A key property of a covering map 
$\pi : Y \to X$ is that lifts of reasonable maps into X always exist, as described in the 
next two results. (See Proposition 1.30 in [Hat].) 


Proposition 13.2 (Path Lifting Property). Suppose $(Y, \pi)$ is a cover of X and 
that $p : [0,1] \to X$ is a (continuous) path with $p(0) = x$. Then for each 
$y \in \pi^{-1}(\{x\})$, there is a unique lift $\tilde{p}$ of p for which $\tilde{p}(0) = y$. 


Proposition 13.3 (Homotopy Lifting Property). Suppose that $l$ is a loop in X, 
that a path p in Y is a lift of $l$, and that $l_s$ is a homotopy of $l$ in X with basepoint 
fixed. Then there is a unique lift of $l_s$ to a homotopy $p_s$ of p in Y with endpoints 
fixed. 


If we can find a universal cover (Y, 7) of a space X, the cover gives a simple 
criterion for determining when a loop in X is null homotopic. 


Corollary 13.4. Suppose that $(Y, \pi)$ is a universal cover of X, that $l$ is a loop in 
X, and that p is a lift of $l$ to Y. Then $l$ is null homotopic in X if and only if p is a 
loop in Y, that is, if and only if $p(1) = p(0)$. 


Proof. If the lift p of $l$ is a loop, then since Y is simply connected, there is a 
homotopy $p_s$ of p to a point with basepoint fixed. Then $l_s := \pi \circ p_s$ is a homotopy 
of $l$ to a point in X. In the other direction, if there is a homotopy $l_s$ of $l$ to a point in 
X, then by Proposition 13.3, we can lift this to a homotopy $p_s$ with endpoints fixed. 
Now, if $l_1$ is the constant loop at $x_0$, then $p_1$, which is a lift of $l_1$, must live entirely 
in $\pi^{-1}(\{x_0\})$, which is a discrete set. Thus, actually, $p_1$ must be constant, and, in 
particular, has equal endpoints. But since $p_s$ is a homotopy with endpoints fixed, the 
endpoints of each $p_s$ must be equal. Thus, $p = p_0$ must be a loop in Y. □
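
The criterion in Corollary 13.4 can be tried out on the circle. The sketch below assumes the standard universal cover $s \mapsto e^{2\pi i s}$ of $S^1$ by $\mathbb{R}$ (not discussed at this point in the text) and approximates the lift of a loop by adding up small phase increments. The sample loops, the function names, and the sampling resolution are all hypothetical choices made for illustration.

```python
import cmath
import math

def lift_to_line(loop, start=0.0, samples=2000):
    """Approximate the lift of a path in the unit circle through s -> e^{2 pi i s}.

    Assumes consecutive sample points are less than half a turn apart, so that
    each small step has a unique short lift (the numerical counterpart of the
    local homeomorphisms in the definition of a covering map).
    """
    lifted = [start]
    previous = loop(0.0)
    for k in range(1, samples + 1):
        current = loop(k / samples)
        step = cmath.phase(current / previous) / (2 * math.pi)
        lifted.append(lifted[-1] + step)
        previous = current
    return lifted

def is_null_homotopic(loop):
    """Criterion of Corollary 13.4 for X = S^1: the lift must close up."""
    lifted = lift_to_line(loop)
    return abs(lifted[-1] - lifted[0]) < 1e-6

def winds_twice(t):
    """Hypothetical loop that goes twice around the circle."""
    return cmath.exp(4j * math.pi * t)

def out_and_back(t):
    """Hypothetical loop that goes a quarter of the way around and then returns."""
    return cmath.exp(2j * math.pi * 0.25 * math.sin(math.pi * t))

winding = lift_to_line(winds_twice)[-1]
```

The endpoint of the lift is the winding number of the loop, so the first sample loop fails the criterion while the second one passes it.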


In this section, we discuss a method of computing, inductively, the fundamental 
groups of the classical compact groups. The same results can also be obtained by 
using the results of Sects. 13.4-13.7; see Exercises 1-4. In all cases, we will find 
that $\pi_1(K)$ is commutative. This is not a coincidence; a general argument shows 
that the fundamental group of any Lie group is commutative (Exercise 7). In the 
case of a compact matrix Lie group K, the commutativity of $\pi_1(K)$ also follows 
from Corollary 13.18. 

For any nice topological space, one can define higher homotopy groups 
$\pi_k(X)$, $k = 1, 2, 3, \ldots$. The group $\pi_k(X)$ is the set of homotopy classes of maps of 
$S^k$ into X, where the notion of homotopy for maps of $S^k$ into X is analogous to that 
for maps of $S^1$ into X. Although one can define a group structure on $\pi_k(X)$, this 
structure is not relevant to us. All that is relevant is what it means for $\pi_k(X)$ to be 
trivial, which is that every continuous map of the k-sphere $S^k$ into X can be shrunk 
continuously to a point. We will make use of the following standard topological 
result (e.g., Corollary 4.9 in [Hat]). 


Proposition 13.5. For a d-sphere $S^d$, $\pi_k(S^d)$ is trivial if $k < d$. 


This result is plausible because for $k < d$, the image of a "typical" continuous 
map of $S^k$ into $S^d$ will not be all of $S^d$. However, if the image of the map omits 
even one point in $S^d$, then we can remove that point and what is left of the sphere 
can be contracted continuously to a point. 

We now introduce the topological concept underlying all the calculations in this 
section, that of a fiber bundle. 


Definition 13.6. Suppose that B and F are Hausdorff topological spaces. A fiber 
bundle with base B and fiber F is a Hausdorff topological space X together with 
a continuous map p : X — B, called the projection map, having the following 
properties. First, for each b in B, the preimage $p^{-1}(b)$ of b in X is homeomorphic 
to F. Second, for every b in B, there is a neighborhood U of b such that $p^{-1}(U)$ is 
homeomorphic to $U \times F$ in such a way that the projection map is simply projection 
onto the first factor. 


In any fiber bundle, the sets of the form $p^{-1}(b)$ are called the fibers. The second 
condition in the definition may be stated more pedantically as follows. For each 
$b \in B$, there should exist a neighborhood U of b and a homeomorphism $\Phi$ 
of $p^{-1}(U)$ with $U \times F$ having the property that $p(x) = p_1(\Phi(x))$, where 
$p_1 : U \times F \to U$ is the map $p_1(u, f) = u$. 

The simplest sort of fiber bundle is the product space X = B x F, with the 
projection map being simply the projection onto the first factor. Such a fiber bundle 
is called trivial. The second condition in the definition of a fiber bundle is called 
local triviality and it says that any fiber bundle must look locally like a trivial 
bundle. In general, X need not be globally homeomorphic to B x F. 

If X were a trivial fiber bundle, then the fundamental group of X would be simply 
the product of the fundamental group of the base B and the fundamental group of 
the fiber F. In particular, if X were a trivial fiber bundle and $\pi_1(B)$ were trivial, 
then $\pi_1(X)$ would be isomorphic to $\pi_1(F)$. The following result says that if $\pi_1(B)$ 
and $\pi_2(B)$ are trivial, then the same conclusion holds, even if X is nontrivial. 


Theorem 13.7. Suppose that X is a fiber bundle with base B and fiber F. If $\pi_1(B)$ 
and $\pi_2(B)$ are trivial, then $\pi_1(X)$ is isomorphic to $\pi_1(F)$. 


Proof. According to a standard topological result (e.g., Theorem 4.41 and 
Proposition 4.48 in [Hat]), there is a long exact sequence of homotopy groups for a fiber 
bundle. The portion of this sequence relevant to us is the following: 

$$\pi_2(B) \xrightarrow{\;f\;} \pi_1(F) \xrightarrow{\;g\;} \pi_1(X) \xrightarrow{\;h\;} \pi_1(B). \tag{13.1}$$

Saying that the sequence is exact means that each map is a homomorphism and 
the image of each map is equal to the kernel of the following map. Since we are 
assuming $\pi_2(B)$ is trivial, the image of f is trivial, which means the kernel of g is 
also trivial. Since $\pi_1(B)$ is also trivial, the kernel of h must be $\pi_1(X)$, which means 
that the image of g is $\pi_1(X)$. Thus, g is an isomorphism of $\pi_1(F)$ with $\pi_1(X)$. □


Proposition 13.8. Suppose G is a matrix Lie group and H is a closed subgroup of 
G. Then G has the structure of a fiber bundle with base G/H and fiber H, where 
the projection map $p : G \to G/H$ is given by $p(x) = [x]$, with $[x]$ denoting the 
coset $xH \in G/H$. 


Proof. For any coset $[x]$ in G/H, the preimage of $[x]$ under p is the set $xH \subset G$, 
which is clearly homeomorphic to H. Meanwhile, the required local triviality 
property of the bundle follows from Lemma 11.21 and Theorem 11.22. (If we take an 
open set U in G/H as in the proof of Theorem 11.22, Lemma 11.21 tells us that the 
preimage of U under p is homeomorphic to $U \times H$ in such a way that the projection 
p is just projection onto the first factor.) □


Proposition 13.9. Consider the map $p : SO(n) \to S^{n-1}$ given by 

$$p(R) = R e_n, \tag{13.2}$$

where $e_n = (0, \ldots, 0, 1)$. Then $(SO(n), p)$ is a fiber bundle with base $S^{n-1}$ and 
fiber $SO(n-1)$. 


Proof. We think of $SO(n-1)$ as the (closed) subgroup of $SO(n)$ consisting of 
block diagonal matrices of the form 

$$R = \begin{pmatrix} R' & 0 \\ 0 & 1 \end{pmatrix}$$

with $R' \in SO(n-1)$. By Proposition 13.8, $SO(n)$ is a fiber bundle with base 
$SO(n)/SO(n-1)$ and fiber $SO(n-1)$. Now, it is easy to see that $SO(n)$ acts 
transitively on the sphere $S^{n-1}$. Thus, the map p in (13.2) maps $SO(n)$ onto $S^{n-1}$. 
Since $R e_n = e_n$ if and only if $R \in SO(n-1)$, we see that p descends to a (continuous) 
bijection of $SO(n)/SO(n-1)$ onto $S^{n-1}$. Since both $SO(n)/SO(n-1)$ and $S^{n-1}$ 
are compact, this map is actually a homeomorphism (Theorem 4.17 in [Rud1]). 
Thus, $SO(n)$ is a fiber bundle of the claimed sort. □


Proposition 13.10. For all $n \ge 3$, the fundamental group of $SO(n)$ is isomorphic 
to $\mathbb{Z}/2$. Meanwhile, $\pi_1(SO(2)) \cong \mathbb{Z}$. 


Proof. Suppose that n is at least 4, so that n - 1 is at least 3. Then, by 
Proposition 13.5, $\pi_1(S^{n-1})$ and $\pi_2(S^{n-1})$ are trivial and, so, Theorem 13.7 and 
Proposition 13.9 tell us that $\pi_1(SO(n))$ is isomorphic to $\pi_1(SO(n-1))$. Thus, 
$\pi_1(SO(n))$ is isomorphic to $\pi_1(SO(3))$ for all $n \ge 4$. It remains to show that 
$\pi_1(SO(3)) \cong \mathbb{Z}/2$. This can be done by noting that $SO(3)$ is homeomorphic to 
$\mathbb{RP}^3$, as in Proposition 1.17, or by observing that the map $\Phi$ in Proposition 1.19 is a 
two-to-one covering map from $SU(2) \cong S^3$ onto $SO(3)$. 

Finally, we observe that $SO(2)$ is homeomorphic to the unit circle $S^1$, so that 
$\pi_1(SO(2)) \cong \mathbb{Z}$ (Theorem 1.7 in [Hat]). □


If one looks into the proof of the long exact sequence of homotopy groups for 
a fiber bundle, one finds that the map g in (13.1) is induced by the inclusion of F 
into X. Thus, if $l$ is a homotopically nontrivial loop in $SO(n)$, then after we include 
$SO(n)$ into $SO(n+1)$, the loop $l$ is still homotopically nontrivial. 

Meanwhile, we may take $SU(2)$ as the universal cover of $SO(3)$, with covering 
map being the homomorphism $\Phi$ in Proposition 1.19 (compare Exercise 8). Now, 
if we take $l$ to be the loop in $SO(3)$ consisting of rotations by angle $\theta$ in the 
$(x_2, x_3)$-plane, $0 \le \theta \le 2\pi$, the computations in (1.15) and (1.16) show that the 
lift of $l$ to $SU(2)$ is not a loop. (Rather, the lift will start at $I$ and end at $-I$.) 
Thus, by Corollary 13.4, $l$ is homotopically nontrivial in $SO(3)$. But this loop $l$ is 
conjugate in $SO(3)$ to the loop of rotations in the $(x_1, x_2)$-plane, so that loop is also 
homotopically nontrivial. Thus, by the discussion in the previous paragraph, we may 
say that, for any $n \ge 3$, the one nontrivial homotopy class in $SO(n)$ is represented 
by the loop 

$$l(\theta) := \begin{pmatrix} \cos\theta & -\sin\theta & & & \\ \sin\theta & \cos\theta & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}, \quad 0 \le \theta \le 2\pi.$$

(Compare Exercise 6.) 
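
The statement that the lift starts at $I$ and ends at $-I$ can be checked numerically. The map $\Phi$ of Proposition 1.19 is not reproduced in this section, so the sketch below assumes the standard two-to-one map built from the Pauli matrices, $R_{jk} = \tfrac{1}{2}\operatorname{trace}(\sigma_j U \sigma_k U^*)$; the text's own conventions may attach a different axis or sign to the same computation. The names `lifted_path`, `project`, and `rotation` are hypothetical.

```python
import cmath
import math

# Pauli matrices, used as a basis for the traceless Hermitian 2 x 2 matrices.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2 x 2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def lifted_path(theta):
    """Candidate lift to SU(2): diag(e^{-i theta/2}, e^{i theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def project(u):
    """Assumed covering map SU(2) -> SO(3): R_jk = (1/2) trace(sigma_j U sigma_k U*)."""
    u_star = adjoint(u)
    return [
        [(0.5 * trace(matmul(PAULI[j], matmul(u, matmul(PAULI[k], u_star))))).real
         for k in range(3)]
        for j in range(3)
    ]

def rotation(theta):
    """Rotation by theta in the (x1, x2)-plane, as in the displayed loop for n = 3."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_difference(a, b):
    """Largest entrywise difference between two matrices of the same shape."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a[0])))

end_point = lifted_path(2 * math.pi)  # where the lifted path ends
```

Under these assumptions the path `lifted_path` projects to the rotation loop for every angle, while its endpoint is $-I$ rather than $I$, which is exactly the failure to close up that Corollary 13.4 detects.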


Proposition 13.11. The group $SU(n)$ is simply connected for all $n \ge 2$. For all 
$n \ge 1$, we have that $\pi_1(U(n)) \cong \mathbb{Z}$. 


Proof. For all $n \ge 3$, the group $SU(n)$ acts transitively on the sphere $S^{2n-1}$. By a 
small modification of the proof of Proposition 13.9, $SU(n)$ is a fiber bundle with 
base $S^{2n-1}$ and fiber $SU(n-1)$. Since $2n - 1 \ge 3$ for all $n \ge 2$, Theorem 13.7 
and Proposition 13.5 tell us that $\pi_1(SU(n)) \cong \pi_1(SU(n-1))$. Since 
$\pi_1(SU(2)) \cong \pi_1(S^3)$ is trivial, we conclude that $\pi_1(SU(n))$ is trivial for all $n \ge 2$. 

The analysis of the case of $U(n)$ is similar. The fiber bundle argument shows 
that $\pi_1(U(n)) \cong \pi_1(U(n-1))$ for all $n \ge 2$. Since $U(1)$ is just the unit circle 
$S^1$, we have that $\pi_1(U(1)) \cong \mathbb{Z}$ (Theorem 1.7 in [Hat]). Thus, $\pi_1(U(n)) \cong \mathbb{Z}$ for 
all $n \ge 1$. □


Proposition 13.12. For all $n \ge 1$, the compact symplectic group $Sp(n)$ is simply 
connected. 


Proof. Since $Sp(n)$ is contained in $U(2n)$, it acts on the unit sphere $S^{4n-1} \subset \mathbb{C}^{2n}$. If 
this action is transitive, then we can imitate the arguments from the cases of $SO(n)$ 
and $SU(n)$. Since $4n - 1 \ge 3$, we see that $\pi_1(S^{4n-1})$ and $\pi_2(S^{4n-1})$ are trivial for all 
$n \ge 1$. We conclude, then, that $\pi_1(Sp(n)) \cong \pi_1(Sp(n-1))$. Since $Sp(1) = SU(2)$, 
which is simply connected, we see that $Sp(n)$ is simply connected for all n. 

It remains, then, to show that $Sp(n)$ acts transitively on $S^{4n-1}$. For this, it suffices 
to show that for all unit vectors $u \in \mathbb{C}^{2n}$, there is some $U \in Sp(n)$ with $U e_1 = u$. 
To see this, let $u_1 = u$ and $v_1 = J u_1$, where J is the map in Sect. 1.2.8. Then $v_1$ is 
orthogonal to $u_1$ (check) and we may consider the space 

$$W := (\operatorname{span}\{u_1, v_1\})^{\perp}.$$

By the calculations in Sect. 1.2.8, W will be invariant under J. Thus, we can choose 
an arbitrary unit vector $u_2$ in W and let $v_2 = J u_2$. Proceeding on in this way, we 
eventually obtain an orthonormal family 

$$u_1, \ldots, u_n, v_1, \ldots, v_n \tag{13.3}$$

in $\mathbb{C}^{2n}$ where $J u_j = v_j$. It is then straightforward to check that the matrix U whose 
columns are the vectors in (13.3) belongs to $Sp(n)$, and that $U e_1 = u_1 = u$. □


13.3 Fundamental Groups of Noncompact Classical Groups 


Using the polar decomposition (Sect. 2.5), we can reduce the computation of 
the fundamental group of certain noncompact groups to the computation of the 
fundamental group of one of the compact groups in Sect. 13.2. Theorem 2.17, for 
example, tells us that GL(n; C) is homeomorphic to U(n) x V for a certain vector 
space V (the space of $n \times n$ self-adjoint matrices). Since V is simply connected, we 
conclude that $\pi_1(GL(n;\mathbb{C})) \cong \pi_1(U(n)) \cong \mathbb{Z}$. Using Proposition 2.19, we can similarly show 
that $\pi_1(SL(n;\mathbb{C})) \cong \pi_1(SU(n))$ and that $\pi_1(SL(n;\mathbb{R})) \cong \pi_1(SO(n))$. 


Conclusion 13.13. For all $n \ge 1$, we have 

$$\pi_1(GL(n;\mathbb{C})) \cong \pi_1(U(n)) \cong \mathbb{Z}.$$

For all $n \ge 2$, the group $SL(n;\mathbb{C})$ is simply connected. For $n \ge 3$, we have 

$$\pi_1(SL(n;\mathbb{R})) \cong \pi_1(SO(n)) \cong \mathbb{Z}/2,$$

whereas 

$$\pi_1(SL(2;\mathbb{R})) \cong \pi_1(SO(2)) \cong \mathbb{Z}.$$


13.4 The Fundamental Groups of K and T 


In this section and the subsequent ones, we develop a different approach to comput- 
ing the fundamental group of a compact group K, based on the torus theorem. In this 
section, we state the main results; the proofs will be developed in Sects. 13.5, 13.6, 
and 13.7. One important consequence of these results will be Corollary 13.20, which 
says that if K is simply connected, then every algebraically integral element is 
analytically integral. This claim allows us (in the simply connected case) to match 
up our theorem of the highest weight for the compact group K with the theorem of 
the highest weight for the Lie algebra $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$. That is to say, when K is simply 
connected, the set of possible highest weights for the representations of K (namely 
the dominant, analytically integral elements) coincides with the set of possible 
highest weights for the representations of $\mathfrak{g}$ (namely the dominant, algebraically 
integral elements). 

In these sections, we will follow the convention of composing the exponential 
map for T with a factor of $2\pi$, writing an element t of T as 

$$t = e^{2\pi H}, \quad H \in \mathfrak{t}.$$

(There are factors of $2\pi$ that have to go somewhere in the theory, and this seems 
to be the most convenient spot to put them.) Recall (Definition 12.3) that $\Gamma \subset \mathfrak{t}$ 
denotes the kernel of the (scaled) exponential map: 

$$\Gamma = \{ H \in \mathfrak{t} \mid e^{2\pi H} = I \}$$

and that $\Gamma$ is invariant under the action of W. 

We begin with a simple result describing the fundamental group of T. 


Proposition 13.14. Every loop in T is homotopic in T to a unique loop of the form 

$$t \mapsto e^{2\pi t \gamma}, \quad 0 \le t \le 1,$$

with $\gamma \in \Gamma$. Furthermore, $\pi_1(T)$ is isomorphic to $\Gamma$. 


Proof. The main issue is to prove that the (scaled) exponential map $H \mapsto e^{2\pi H}$ is a 
covering map. Since the kernel $\Gamma$ of the exponential map is discrete, there is some 
$\varepsilon > 0$ such that every nonzero element $\gamma$ of $\Gamma$ has norm at least $\varepsilon$. Let $B_{\varepsilon/2}(\gamma)$ be 
the ball of radius $\varepsilon/2$ around a point $\gamma \in \Gamma$. Now, for any $t \in T$, write t as $e^{2\pi H}$ for 
some $H \in \mathfrak{t}$. Then let $V \subset T$ denote the set of elements of the form $e^{2\pi H'}$, where 
$H' \in B_{\varepsilon/2}(H)$, so that V is a neighborhood of t. The preimage of V under the 
exponential is the union of the balls 

$$B_{\varepsilon/2}(H + \gamma), \quad \gamma \in \Gamma.$$

By the way $\varepsilon$ was chosen, these balls are disjoint, and each ball maps 
homeomorphically onto V. Since we can do this for each $t \in T$, we see that the exponential is 
a covering map. 

Since, also, $\mathfrak{t}$ is simply connected, $\mathfrak{t}$ is the universal cover of T. Now, every loop $l$ 
in T based at the identity has a unique lift to a path $\tilde{l}$ in $\mathfrak{t}$ starting at 0 and ending at 
some point $\gamma$ in $\Gamma$. The theory of covering spaces tells us that two loops $l_1$ and $l_2$ in T 
(based at $I$) are homotopic if and only if $\tilde{l}_1(1) = \tilde{l}_2(1)$. Meanwhile, if $\tilde{l}(1) = \gamma$, then 
(since $\mathfrak{t}$ is simply connected) $\tilde{l}$ is homotopic with endpoints fixed to the straight-line 
path $t \mapsto t\gamma$, showing that $l$ itself is homotopic to $t \mapsto e^{2\pi t \gamma}$, as claimed. Finally, 
if we compose two loops of the form $t \mapsto e^{2\pi t \gamma_1}$ and $t \mapsto e^{2\pi t \gamma_2}$, the lift of the 
composite loop will be the composition of the lifts, where the lift of the second part 
of the composite loop must be taken to start at $\gamma_1$. Thus, the lift of the composite 
loop will go from 0 to $\gamma_1$ and then (shifting the lift of the second loop by $\gamma_1$) from $\gamma_1$ 
to $\gamma_1 + \gamma_2$. But any path from 0 to $\gamma_1 + \gamma_2$ in $\mathfrak{t}$ is homotopic with endpoints fixed to 
the straight-line path $t \mapsto t(\gamma_1 + \gamma_2)$, whose image in T is the loop $t \mapsto e^{2\pi t(\gamma_1 + \gamma_2)}$. 
Thus, if we identify elements of $\pi_1(T)$ with elements of $\Gamma$, the composition operation 
corresponds to addition in $\Gamma$. □


We now state the first main result of this section; the proof is given in Sect. 13.7. 
Theorem 13.15. Every loop in K is homotopic to a loop in T. 


The theorem does not mean that $\pi_1(K)$ is isomorphic to $\pi_1(T)$, since a loop 
in T may be null homotopic in K even if it is not null homotopic in T. Indeed, 
$\pi_1(T)$ (which is isomorphic to $\Gamma$) is often very different from $\pi_1(K)$ (which is, for 
example, trivial when $K = SU(n)$). Nevertheless, the theorem gives us a useful 
way to study $\pi_1(K)$, because we understand $\pi_1(T)$. Taking Theorem 13.15 and 
Proposition 13.14 together, we see that every loop in K is homotopic to a loop of 
the form $t \mapsto e^{2\pi t \gamma}$, with $\gamma \in \Gamma$. Thus, we are faced with a very concrete problem 
to calculate $\pi_1(K)$: We must only determine, for each $\gamma \in \Gamma$, whether the loop 
$t \mapsto e^{2\pi t \gamma}$ is null homotopic in the compact group K. (By Proposition 13.14, such 
a loop is null homotopic in the torus T only if $\gamma = 0$.) 

The condition for $t \mapsto e^{2\pi t \gamma}$ to be null homotopic in K turns out to be related 
to the notion of coroots. Recall that if $\alpha$ is a real root for $\mathfrak{t}$, then (Lemma 12.8), 
the associated real coroot $H_\alpha = 2\alpha/\langle \alpha, \alpha \rangle$ belongs to $\Gamma$. Thus, any integer linear 
combination of coroots also belongs to $\Gamma$. 


Definition 13.16. The coroot lattice, denoted $I$, is the set of all integer linear 
combinations of real coroots $H_\alpha$, $\alpha \in R$. 


We now state the second main result of this section; the proof is also in Sect. 13.7. 


Theorem 13.17. For each $\gamma \in \Gamma$, the loop $t \mapsto e^{2\pi t \gamma}$ is null homotopic in K if and 
only if $\gamma$ belongs to the coroot lattice $I$. 


If we combine this result with Theorem 13.15, we obtain the following 
description of the fundamental group of K. 


Corollary 13.18. The fundamental group of K is isomorphic to the quotient group 
$\Gamma/I$, where the quotient is of commutative groups. 


Proof. By Theorem 13.15, every loop in K is homotopic to a loop of the form 
$t \mapsto e^{2\pi t \gamma}$ with $\gamma \in \Gamma$. Under this correspondence, composition of loops corresponds to 
addition in $\Gamma$. By Theorem 13.17, two loops of the form $t \mapsto e^{2\pi t \gamma_1}$ and $t \mapsto e^{2\pi t \gamma_2}$ 
are homotopic if and only if $\gamma_1 - \gamma_2$ belongs to $I$. Thus, $\pi_1(K)$ may be identified 
with the set of cosets of $I$ in $\Gamma$. □


We now consider three examples. The first two are familiar friends, the groups 
$SO(5)$ and $SU(3)$. The third is the projective unitary group $PSU(3)$. In general, 
$PSU(n)$ is the quotient of $SU(n)$ by the subgroup consisting of those multiples 
of the identity in $U(n)$ that have determinant 1. In the case n = 3, a matrix of the form 
$e^{i\theta} I$ has determinant 1 if and only if $e^{3i\theta} = 1$. Thus, 

$$PSU(3) = SU(3)/\{I,\; e^{2\pi i/3} I,\; e^{4\pi i/3} I\}. \tag{13.4}$$

We can represent $PSU(3)$ as a matrix group by using the adjoint representation. It 
is easy to check that the center of $SU(3)$ is the group being divided by on the 
right-hand side of (13.4); thus, the image of $SU(3)$ under the adjoint representation will be a 
subgroup of $GL(\mathfrak{sl}(3;\mathbb{C}))$ isomorphic to $PSU(3)$. Since the center of the Lie algebra 
$\mathfrak{su}(3)$ is trivial, the Lie algebra version of the adjoint representation is faithful; thus, 
we may identify the Lie algebra of $PSU(3)$ with the Lie algebra of $SU(3)$. 


Example 13.19. If $K = SO(5)$, then $\Gamma/I \cong \mathbb{Z}/2$. If $K = SU(3)$, then $\Gamma/I$ is 
trivial. Finally, if $K = PSU(3)$, then $\Gamma/I \cong \mathbb{Z}/3$. 


The verification of the claims in Example 13.19 is left to the reader (Exercise 5). 
Corollary 13.18 together with Example 13.19 gives another way of computing the 
fundamental groups of $SO(5)$ (namely $\mathbb{Z}/2$) and $SU(3)$ (trivial), in addition to 
Propositions 13.10 and 13.11. 
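
As a sanity check on the orders of the quotients in Example 13.19 (not a substitute for the verification asked for in Exercise 5), one can compare lattice bases. The sketch below rests on assumptions that are not spelled out in the text: for SO(5), elements of $\mathfrak{t}$ are described by coordinates (a, b) for which $e^{2\pi H}$ rotates two coordinate planes by $2\pi a$ and $2\pi b$, so that $\Gamma = \mathbb{Z}^2$ and the coroot lattice is spanned by (1, 1) and (1, -1); for SU(3) and PSU(3), coordinates are taken with respect to the two simple coroots, and the kernel for PSU(3) is assumed to be spanned by (2/3, 1/3) and (1/3, 2/3). The function names are hypothetical.

```python
from fractions import Fraction

def det2(basis):
    """Determinant of a 2 x 2 matrix whose rows are the basis vectors."""
    (a, b), (c, d) = basis
    return a * d - b * c

def lattice_index(sublattice_basis, lattice_basis):
    """Index of a rank-2 sublattice, computed as a ratio of absolute determinants.

    Only meaningful when the first lattice really is contained in the second.
    """
    return abs(Fraction(det2(sublattice_basis)) / Fraction(det2(lattice_basis)))

# SO(5): coroot lattice inside the kernel Z^2 (assumed coordinates, see above).
so5_coroots = [(1, 1), (1, -1)]
so5_kernel = [(1, 0), (0, 1)]

# SU(3) and PSU(3): coordinates relative to the two simple coroots.
a2_coroots = [(1, 0), (0, 1)]
su3_kernel = [(1, 0), (0, 1)]
psu3_kernel = [(Fraction(2, 3), Fraction(1, 3)), (Fraction(1, 3), Fraction(2, 3))]

so5_index = lattice_index(so5_coroots, so5_kernel)
su3_index = lattice_index(a2_coroots, su3_kernel)
psu3_index = lattice_index(a2_coroots, psu3_kernel)
```

The determinant ratio only gives the order of the quotient. For orders 1, 2, and 3 that already determines the group up to isomorphism; for larger orders it would not.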

Figures 13.1 and 13.2 show $\Gamma$ and $I$ in the case of the groups $SO(5)$ and $PSU(3)$. 
The black dots indicate points in $I$, whereas white dots indicate points in $\Gamma$ that are 
not in $I$. In the case of $SO(5)$, it is easy to see that any two white dots differ by an 
element of the coroot lattice, showing that there is exactly one nontrivial element of 
$\Gamma/I$. In the case of $PSU(3)$, note that the elements $\gamma_1$ and $\gamma_2 = 2\gamma_1$ are not in $I$, but 
$3\gamma_1$ is in $I$, showing that $[\gamma_1]$ is an element of order 3 in $\Gamma/I$. The reader may verify 
that every element of $\Gamma$ is either in $I$, differs from $\gamma_1$ by an element of $I$, or differs 
from $\gamma_2$ by an element of $I$. The situation for $SU(3)$ is similar to that for $PSU(3)$; 
the coroot lattice $I$ does not change, but $\Gamma$ is now equal to $I$, so that $\Gamma/I$ is trivial. 
For now, the reader may regard the lines in the figures as merely decorative; these 
lines will turn out to make up the "Stiefel diagram" for the relevant group. (See 
Sect. 13.6.) 

Fig. 13.1 The coroot lattice (black dots) and the kernel of the exponential mapping (black and 
white dots) for the group SO(5) 

Fig. 13.2 The coroot lattice (black dots) and the kernel of the exponential mapping (black and 
white dots) for the group PSU(3) 


Corollary 13.20. If K is simply connected, then every algebraically integral 
element is analytically integral. 


Proof. In light of Theorem 13.17, K is simply connected if and only if $I = \Gamma$. If 
$\lambda \in \mathfrak{t}$ is algebraically integral, then $\langle \lambda, H_\alpha \rangle \in \mathbb{Z}$ for all $\alpha$, where $H_\alpha = 2\alpha/\langle \alpha, \alpha \rangle$ 
is the real coroot associated to $\alpha$. It follows that $\langle \lambda, \gamma \rangle \in \mathbb{Z}$ for every element $\gamma$ of $I$, 
the set of integer linear combinations of coroots. Thus, if K is simply connected, 
$\langle \lambda, \gamma \rangle \in \mathbb{Z}$ for every element $\gamma$ of $I = \Gamma$, which means that $\lambda$ is analytically 
integral. □


We may offer a completely different proof of Corollary 13.20, as follows. We 
first observe that if K is simply connected, then by Proposition 7.7, the complex 
Lie algebra $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is semisimple. Then let $\lambda$ be an algebraically integral element. 
Since the sets of analytically integral and algebraically integral elements are both 
invariant under the action of W, it is harmless to assume that $\lambda$ is dominant. Thus, 
by the results of Chapter 9, there is a finite-dimensional irreducible representation 
$(\pi, V_\lambda)$ of $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ with highest weight $\lambda$. Since K is simply connected, Theorem 5.6 
then tells us that there is an associated representation $\Pi$ of K acting on $V_\lambda$. Thus, 
$\lambda$ is a weight for a representation of the group K, which implies (Proposition 12.5) 
that $\lambda$ is analytically integral. 

Although it is mathematically correct, the preceding argument may be considered 
as “cheating,” since it depends on the whole machinery of Verma modules (to 
construct the representations of g) and on the Baker-Campbell—Hausdorff formula 
(to prove Theorem 5.6). In the subsequent sections, we will use techniques similar 
to those in the proof of the torus theorem to prove Theorem 13.17 and thus to give a 
more direct (but not easy!) proof of Corollary 13.20. 


Corollary 13.21. If K is simply connected, the element $\delta$ (half the sum of the real, 
positive roots) is analytically integral. 


Proof. According to Proposition 8.38 (translated into the language of real roots), the 
element $\delta$ is algebraically integral. But by Corollary 13.20, if K is simply connected, 
every algebraically integral element is analytically integral. □
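
The hypothesis of simple connectedness cannot be dropped, as a small computation for SO(5) suggests. The sketch below is illustrative and relies on assumptions not fixed in the text: the maximal torus is described by coordinates in which $\Gamma = \mathbb{Z}^2$, the inner product is the standard one, and the positive real roots are taken to be (1, 1), (1, -1), (1, 0), and (0, 1). The helper names are hypothetical.

```python
from fractions import Fraction

def pairing(x, y):
    """Standard inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(x, y))

# Assumed positive real roots of SO(5) in coordinates where Gamma = Z^2.
positive_roots = [(1, 1), (1, -1), (1, 0), (0, 1)]

def coroot(alpha):
    """H_alpha = 2 alpha / <alpha, alpha>."""
    norm_sq = pairing(alpha, alpha)
    return tuple(Fraction(2 * a, norm_sq) for a in alpha)

# delta is half the sum of the positive roots.
delta = tuple(Fraction(sum(r[i] for r in positive_roots), 2) for i in range(2))

def is_integer(q):
    return Fraction(q).denominator == 1

# Algebraic integrality: integer pairing with every coroot.
algebraically_integral = all(
    is_integer(pairing(delta, coroot(alpha))) for alpha in positive_roots
)

# Analytic integrality: integer pairing with a basis of the kernel Gamma.
kernel_basis = [(1, 0), (0, 1)]
analytically_integral = all(is_integer(pairing(delta, g)) for g in kernel_basis)
```

Under these assumptions $\delta = (3/2, 1/2)$ pairs integrally with every coroot but not with every element of $\Gamma$, which is consistent with SO(5) having fundamental group $\mathbb{Z}/2$ and with the need for the modifications described in Sect. 12.6.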


Since it is easy to do so, we will immediately prove one direction of Theorem 13.17,
namely that if $\gamma$ is in the coroot lattice $I$, then the loop $\tau \mapsto e^{2\pi\tau\gamma}$ is
homotopically trivial in K.


Proof of Theorem 13.17, one direction. We assume at first that $\gamma = H_\alpha$, a single
real coroot. Then by Corollary 7.20, there is a homomorphism $\phi$ of su(2) into $\mathfrak{k}$
such that $\phi$ maps the element $iH = \mathrm{diag}(i, -i)$ in su(2) to the real coroot $H_\alpha$.
Since SU(2) is simply connected, there exists a homomorphism $\Phi : SU(2) \to K$
such that $\Phi(e^X) = e^{\phi(X)}$ for all $X \in su(2)$.

Consider, then, the loop $l$ in SU(2) given by

\[ l(\tau) = e^{2\pi\tau iH} = \begin{pmatrix} e^{2\pi i\tau} & 0 \\ 0 & e^{-2\pi i\tau} \end{pmatrix}, \quad 0 \le \tau \le 1. \]

Observe that $\Phi(l(\tau)) = e^{2\pi\tau H_\alpha}$, so that the image of $l$ under $\Phi$ is the relevant loop
in K. Since SU(2) is simply connected, there is a family $l_s$ of loops connecting $l$ to
a constant loop in SU(2). Thus, the loops

\[ \tau \mapsto \Phi(l_s(\tau)) \]

constitute a homotopy of our original loop to a point in K.

Now, if $\gamma$ is an integer linear combination of coroots, then by Proposition 13.14,
the loop $\tau \mapsto e^{2\pi\tau\gamma}$ is homotopic (in T and thus in K) to a composition of loops
of the form $\tau \mapsto e^{2\pi\tau H_\alpha}$ for various coroots $H_\alpha$. Since each of those loops is null
homotopic in K, so is the loop $\tau \mapsto e^{2\pi\tau\gamma}$. □

We will study the topology of K by writing every element x of K as

\[ x = y t y^{-1}, \]

with $y \in K$ and $t \in T$. It is then convenient to write the variable t in the preceding
expression in exponential coordinates. Thus, we may consider the map
$\Psi : \mathfrak{t} \times (K/T) \to K$ given by

\[ \Psi(H, [y]) = y e^{2\pi H} y^{-1}. \qquad (13.5) \]

(We continue to scale the exponential map for T by a factor of $2\pi$.) The torus
theorem tells us that $\Psi$ maps onto K. On the other hand, the behavior of $\Psi$ is easiest
to understand when the differential $\Psi_*$ is nonsingular.

We now introduce a notion of “regular” elements in K; we will see (Proposition 13.24)
that $\Psi_*(H, [y])$ is invertible if and only if $x := \Psi(H, [y])$ is regular. It
will turn out that the fundamental group of K is the same as the fundamental group
of the set of regular elements in K.


Definition 13.22. If $x \in K$ is contained in a unique maximal torus, x is regular;
if x is contained in two distinct maximal tori, x is singular. The set of regular
elements in K is denoted $K_{\mathrm{reg}}$ and the set of singular elements is denoted $K_{\mathrm{sing}}$.


If, for example, $t \in T$ generates a dense subgroup of T, then the only maximal
torus containing $yty^{-1}$ is $yTy^{-1}$. Thus, for such a t, the element $yty^{-1}$ is regular.
As we will see, however, $yty^{-1}$ can be regular even if t does not generate a dense
subgroup of T.


Proposition 13.23. Suppose $x \in K$ has the form $x = yty^{-1}$ with $t \in T$ and $y \in K$.
If there exists some $X \in \mathfrak{k}$ with $X \notin \mathfrak{t}$ such that

\[ \mathrm{Ad}_t(X) = X, \]

then x is singular. If no such X exists, x is regular.


Proof. The condition of being regular is clearly invariant under conjugation; that is,
x is regular if and only if t is regular. Suppose now that there is some $X \notin \mathfrak{t}$ with
$\mathrm{Ad}_t(X) = X$. Then t commutes with $e^{\tau X}$ for all $\tau \in \mathbb{R}$. Applying Lemma 11.37
with $S = \{e^{\tau X}\}_{\tau \in \mathbb{R}}$, there is a maximal torus $S'$ containing both t and $\{e^{\tau X}\}_{\tau \in \mathbb{R}}$.
But then the Lie algebra $\mathfrak{s}'$ of $S'$ must contain X, which is not in $\mathfrak{t}$, which means
that $S' \neq T$. Thus, t (and therefore, also x) is singular.

In the other direction, if x (and therefore, also, t) is singular, then there is a
maximal torus $S' \neq T$ containing t. But we cannot have $S' \subset T$, or else $S'$ would
not be maximal. Thus, there must be some X in the Lie algebra $\mathfrak{s}'$ of $S'$ that is not
in $\mathfrak{t}$. But since $t \in S'$ and $S'$ is commutative, we have $\mathrm{Ad}_t(X) = X$. □

Proposition 13.24. An element $x = y e^{2\pi H} y^{-1}$ is singular if and only if there is
some root $\alpha$ for which

\[ \langle \alpha, H \rangle \in \mathbb{Z}. \]

It follows that $x = y e^{2\pi H} y^{-1}$ is singular if and only if $\Psi_*$ is singular at the point
$(H, [y])$.


Note that for each fixed $\alpha \in R$ and $n \in \mathbb{Z}$, the set of H in $\mathfrak{t}$ for which $\langle \alpha, H \rangle = n$
is a hyperplane (not necessarily through the origin).


Proof. By Proposition 13.23, x is singular if and only if $\mathrm{Ad}_{e^{2\pi H}}(X) = X$ for some
$X \notin \mathfrak{t}$. Now, the dimension of the eigenspace of $\mathrm{Ad}_{e^{2\pi H}}$ with eigenvalue $1 \in \mathbb{R}$ is
the same whether we work over $\mathbb{R}$ or over $\mathbb{C}$. The eigenvalues of $\mathrm{Ad}_{e^{2\pi H}}$ over $\mathbb{C}$ are
1 (on $\mathfrak{h}$) together with the numbers of the form

\[ e^{2\pi i \langle \alpha, H \rangle}, \quad \alpha \in R, \]

(from the root space $\mathfrak{g}_\alpha$). Thus, the dimension of the 1-eigenspace is greater than
$\dim \mathfrak{h}$ if and only if $e^{2\pi i \langle \alpha, H \rangle} = 1$ for some $\alpha$, which holds if and only if
$\langle \alpha, H \rangle \in \mathbb{Z}$.

Meanwhile, the map $\Psi$ is just the map $\Phi$ in Definition 11.25, composed with the
exponential map for T. Since T is commutative, the differential of the exponential
map for T is the identity at each point. Thus, using Proposition 11.27, we see that
$\Psi_*(H, [y])$ is singular if and only if the restriction of $\mathrm{Ad}_{e^{-2\pi H}}$ to $\mathfrak{t}^\perp$ (or, equivalently,
to $(\mathfrak{t}^\perp)_{\mathbb{C}}$) has an eigenvalue of 1. Since the eigenvalues of $\mathrm{Ad}_{e^{-2\pi H}}$ on $(\mathfrak{t}^\perp)_{\mathbb{C}}$ are the
numbers of the form $e^{-2\pi i \langle \alpha, H \rangle}$, we see that $\Psi_*(H, [y])$ is singular if and only if
$\langle \alpha, H \rangle \in \mathbb{Z}$ for some $\alpha$. □


Definition 13.25. An element H of $\mathfrak{t}$ is regular if for all $\alpha \in R$, the quantity
$\langle \alpha, H \rangle$ is not an integer. Otherwise, H is singular. The set of regular elements in $\mathfrak{t}$
is denoted $\mathfrak{t}_{\mathrm{reg}}$.
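
The regularity criterion can be tried out numerically. The following is a minimal sketch, assuming the special case K = SU(n) with T the diagonal subgroup, so that an element of $\mathfrak{t}$ is $H = i\,\mathrm{diag}(a_1, \ldots, a_n)$ and the real roots take the values $a_j - a_k$ on H. The function names are illustrative and are not notation from the text.

```python
from fractions import Fraction
from itertools import combinations
import cmath

def real_root_values(a):
    """Values a_j - a_k (j < k) of the positive real roots on H = i*diag(a_1, ..., a_n).
    Assumes K = SU(n) with the diagonal maximal torus (an illustrative special case)."""
    return [a[j] - a[k] for j, k in combinations(range(len(a)), 2)]

def is_regular(a):
    """H is regular when no root takes an integer value on H.
    Negative roots give the negated values, so positive roots suffice."""
    return all(v.denominator != 1 for v in real_root_values(a))

def has_repeated_eigenvalue(a, tol=1e-9):
    """Numerical cross-check: e^{2 pi H} = diag(e^{2 pi i a_j}) has two equal
    diagonal entries exactly when some a_j - a_k is an integer."""
    eigs = [cmath.exp(2j * cmath.pi * float(x)) for x in a]
    return any(abs(p - q) < tol for p, q in combinations(eigs, 2))

regular_H = [Fraction(1, 3), Fraction(0), Fraction(-1, 3)]
singular_H = [Fraction(1, 2), Fraction(0), Fraction(-1, 2)]  # a_1 - a_3 = 1
print(is_regular(regular_H), is_regular(singular_H))
```

Exact rational arithmetic is used for the integrality test so that the answer does not depend on floating-point rounding; the eigenvalue check is only a numerical cross-check.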


A key issue in the proof of Theorem 13.17 is to understand the extent to which the
map $\Psi$ in (13.5) fails to be injective. There are two obvious sources of noninjectivity
for $\Psi$. The first is the kernel $\Gamma$ of the exponential map; clearly if $\gamma \in \Gamma$, then

\[ \Psi(H + \gamma, [x]) = \Psi(H, [x]). \]

Meanwhile, if $w \in W$ and $z \in N(T)$ represents w, then

\[ \Psi(w \cdot H, [xz^{-1}]) = xz^{-1} e^{2\pi w \cdot H} (xz^{-1})^{-1} = x e^{2\pi H} x^{-1} = \Psi(H, [x]), \]

since $z^{-1}$ represents $w^{-1}$. We now demonstrate that if we restrict $\Psi$ to $\mathfrak{t}_{\mathrm{reg}} \times (K/T)$,
these two sources account for all of the noninjectivity of $\Psi$.

Proposition 13.26. Suppose $(H, [x])$ and $(H', [x'])$ belong to $\mathfrak{t}_{\mathrm{reg}} \times (K/T)$. Then
$\Psi(H, [x]) = \Psi(H', [x'])$ if and only if there exist some $w = [z]$ in W and some
$\gamma \in \Gamma$ such that

\[ H' = w \cdot H + \gamma, \qquad [x'] = [xz^{-1}]. \qquad (13.6) \]

Here $[z]$ denotes the coset containing $z \in N(T)$ in $W = N(T)/T$. Furthermore,
if the elements in (13.6) satisfy $H' = H$ and $[x'] = [x]$, then $\gamma = 0$ and w is the
identity element of W.


Note that if $z \in N(T)$ and $t \in T$, then

\[ xtz^{-1} = xz^{-1}(ztz^{-1}), \]

where $ztz^{-1} \in T$. Thus, $[xtz^{-1}]$ and $[xz^{-1}]$ are equal in K/T. That is to say, for
a fixed $z \in N(T)$, the map $[x] \mapsto [xz^{-1}]$ is a well-defined map of K/T to itself.
A similar argument shows that this action depends only on the coset of z in N(T)/T.


Proof. If $H'$ and $x'$ are as in (13.6), then by the calculations preceding the statement
of the proposition, we will have $\Psi(H', [x']) = \Psi(H, [x])$. In the other direction, if
$\Psi(H, [x]) = \Psi(H', [x'])$, then

\[ x e^{2\pi H} x^{-1} = x' e^{2\pi H'} (x')^{-1}, \]

which means that

\[ e^{2\pi H} = z^{-1} e^{2\pi H'} z, \qquad (13.7) \]

where $z = (x')^{-1} x$.

Now, the relation (13.7) implies that $e^{2\pi H}$ belongs to the torus $z^{-1}Tz$. Since
$H \in \mathfrak{t}_{\mathrm{reg}}$, it follows from Proposition 13.24 that $z^{-1}Tz = T$, that is, that $z \in N(T)$. Then,
if $w = [z]$, we have

\[ e^{2\pi H} = e^{2\pi w^{-1} \cdot H'}. \]

From this, we obtain $e^{2\pi (w^{-1} \cdot H' - H)} = I$, which means that $w^{-1} \cdot H' - H$ belongs
to $\Gamma$. Since $\Gamma$ is invariant under the action of W, the element $\gamma := H' - w \cdot H$ also
belongs to $\Gamma$, and we find that $H' = w \cdot H + \gamma$ and $x' = xz^{-1}$, as claimed.

Finally, if $H' = H$ and $[x'] = [x]$, then $x'$ and x belong to the same coset
in K/T, which means that $z^{-1}$ must be in T. Thus, w is the identity element in
$W = N(T)/T$. But once $w = e$, we see that $H' = H$ only if $\gamma = 0$. □
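
The two sources of noninjectivity can be checked numerically in the simplest case. The sketch below assumes K = SU(2) with T the diagonal subgroup, writes an element of $\mathfrak{t}$ as $H = \mathrm{diag}(ia, -ia)$ (so that the kernel of the scaled exponential map corresponds to integer a), and uses the matrix z below as a representative of the nontrivial Weyl group element, which acts by $a \mapsto -a$. The helper names and the sample point x are illustrative choices, not taken from the text.

```python
import cmath

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose; this is the inverse when A is unitary."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def torus_exp(a):
    """e^{2 pi H} for H = diag(i a, -i a)."""
    return [[cmath.exp(2j * cmath.pi * a), 0], [0, cmath.exp(-2j * cmath.pi * a)]]

def psi(a, x):
    """Psi(H, [x]) = x e^{2 pi H} x^{-1}, with x unitary."""
    return mat_mul(mat_mul(x, torus_exp(a)), dagger(x))

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

x = [[0.6, 0.8j], [0.8j, 0.6]]   # a sample element of SU(2)
z = [[0, 1], [-1, 0]]            # normalizes T; conjugation swaps the diagonal entries
a = 0.3

# Translation by an element of the kernel (a -> a + 1) does not change Psi.
same_after_translation = close(psi(a + 1, x), psi(a, x))
# The Weyl group move (H, [x]) -> (w.H, [x z^{-1}]) does not change Psi either.
same_after_weyl = close(psi(-a, mat_mul(x, dagger(z))), psi(a, x))
print(same_after_translation, same_after_weyl)
```

A half-integer shift of a, by contrast, replaces $e^{2\pi H}$ by its negative and so changes the value of $\Psi$.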


We now come to a key result that is essential to the proofs of Theorems 13.15
and 13.17.

Theorem 13.27. The fundamental groups of K and $K_{\mathrm{reg}}$ are isomorphic. Specifically,
every loop in K is homotopic to a loop in $K_{\mathrm{reg}}$, and a loop in $K_{\mathrm{reg}}$ is null
homotopic in K only if it is null homotopic in $K_{\mathrm{reg}}$.


To prove this result, we will first show that the singular set in K is “small,”
meaning that it has codimension at least 3. (In the case K = SU(2), for example,
the singular set is just $\{I, -I\}$, so that $K_{\mathrm{sing}}$ has dimension 0, whereas K has
dimension 3.) We will then argue as follows. Let n be the dimension of K and
suppose E and F are subsets of K of dimension k and l, respectively. If $k + l < n$,
then “generically” E and F will not intersect. If E and F do intersect, then (we will
show) it is possible to perturb F slightly so as to be disjoint from E. We first apply
this result with $E = K_{\mathrm{sing}}$ and F being a loop in K. Then E has dimension at most
$n - 3$ while F has dimension 1, so the loop F is homotopic to a loop that does not
intersect $K_{\mathrm{sing}}$. We next apply this result with $E = K_{\mathrm{sing}}$ and F being a homotopy
of a loop $l \subset K_{\mathrm{reg}}$. Then E still has dimension at most $n - 3$ while F has dimension
2 (the image of a square), so the homotopy can be deformed to a homotopy that
does not intersect $K_{\mathrm{sing}}$. In the remainder of this section, we will flesh out the above
argument.


Lemma 13.28. There exist finitely many smooth compact manifolds $M_1, \ldots, M_N$
together with smooth maps $f_j : M_j \to K$ such that (1) each $M_j$ has dimension at
most $\dim K - 3$, and (2) each element of $K_{\mathrm{sing}}$ is in the image of $f_j$ for some j.


Proof. Since each root $\alpha$ is analytically integral (Proposition 12.7), there exists a
map $f_\alpha : T \to S^1$ such that

\[ f_\alpha(e^{2\pi H}) = e^{2\pi i \langle \alpha, H \rangle} \qquad (13.8) \]

for all $H \in \mathfrak{t}$. Clearly, $f_\alpha$ is actually a homomorphism of T into $S^1$. Let $T_\alpha$ be the
kernel of $f_\alpha$, so that $T_\alpha$ is a closed subgroup of T. The Lie algebra of $T_\alpha$ is the set
of $H \in \mathfrak{t}$ with $\langle \alpha, H \rangle = 0$; thus, $T_\alpha$ has dimension one less than the dimension of
T. (Note that $T_\alpha$ may not be connected.) For each $H \in \mathfrak{t}$, we see from (13.8) that
$e^{2\pi H}$ belongs to $T_\alpha$ if and only if $\langle \alpha, H \rangle$ is an integer. Thus, by the torus theorem
and Proposition 13.24, each singular element in K is conjugate to an element of $T_\alpha$,
for some $\alpha$.

Now fix a root $\alpha$ and let $C(T_\alpha)$ denote the centralizer of $T_\alpha$, that is, the set of
$x \in K$ such that x commutes with every element of $T_\alpha$. Then $C(T_\alpha)$ is a closed
subgroup of K, and the Lie algebra of $C(T_\alpha)$ consists of those $X \in \mathfrak{k}$ such that
$\mathrm{Ad}_t(X) = X$ for all $t \in T_\alpha$. Suppose now that $X_\alpha$ belongs to the root space $\mathfrak{g}_\alpha$ and
$t = e^{2\pi H}$ belongs to $T_\alpha$. Then

\[ \mathrm{Ad}_{e^{2\pi H}}(X_\alpha) = e^{2\pi\, \mathrm{ad}_H}(X_\alpha) = e^{2\pi i \langle \alpha, H \rangle} X_\alpha = X_\alpha, \]

since $f_\alpha(e^{2\pi H}) = e^{2\pi i \langle \alpha, H \rangle} = 1$, by assumption, and similarly with $X_\alpha$ replaced by
$Y_\alpha := X_\alpha^*$. Thus, the Lie algebra of $C(T_\alpha)$ will contain $\mathfrak{t}$ and at least two additional
elements, $X_\alpha - Y_\alpha$ and $i(X_\alpha + Y_\alpha)$, that are independent of each other and of $\mathfrak{t}$.
(Compare Corollary 7.20.) We conclude that the dimension of $C(T_\alpha)$ is at least
$\dim T + 2$.

Now, the map $f_\alpha : T_\alpha \times K \to K$ given by

\[ f_\alpha(t, x) = x t x^{-1} \]

descends to a map (still called $f_\alpha$) of $T_\alpha \times (K/C(T_\alpha))$ into K. Furthermore, we
compute that

\[ \begin{aligned}
\dim(T_\alpha \times (K/C(T_\alpha))) &= \dim T_\alpha + \dim K - \dim C(T_\alpha) \\
&\le \dim T - 1 + \dim K - \dim T - 2 \\
&= \dim K - 3.
\end{aligned} \]

Since, as we have said, every singular element is conjugate to an element of some
$T_\alpha$, we have proved the lemma, with the M's being the manifolds $T_\alpha \times (K/C(T_\alpha))$
and the f's being the maps $f_\alpha$. □


Lemma 13.29. Let M and N be compact manifolds and let D be a compact 
manifold with boundary, with 


dim M + dim D < dim N. 


Let $f : M \to N$ and $g : D \to N$ be smooth maps. Suppose E is a closed subset
of D such that g(E) is disjoint from f(M). Then g is homotopic to a map $g'$ such
that $g' = g$ on E and such that $g'(D)$ is disjoint from f(M).


Since g(E) is already disjoint from f(M), it is plausible that we can deform
g without changing its values on E to make the image disjoint from f(M). Our
proof will make use of the following result: If X and Y are smooth manifolds with
$\dim X < \dim Y$ and $f : X \to Y$ is a smooth map, then the image of X under f is a
set of measure zero in Y. (Note that we do not assume $f_*$ is injective.) This result is
a consequence of Sard’s theorem; see, for example, Corollary 6.11 in [Lee]. We will
use this result to show that g can be moved locally off of f(M); a finite number of
these local moves will then produce the desired map $g'$.


Proof. Step 1: The local move. For $x \in D \setminus E$, let us choose a neighborhood U
of g(x) diffeomorphic to $\mathbb{R}^n$, where $n = \dim N$. We may then define a map

\[ h : f^{-1}(U) \times g^{-1}(U) \to U \]

by

\[ h(m, x) = f(m) - g(x), \]

where the difference is computed in $\mathbb{R}^n \cong U$. By our assumption on the
dimensions, the image of h is a set of measure zero in U. Thus, in every
neighborhood of 0, we can find some p that is not in the image of h.

Suppose that W is any neighborhood of x in D such that $\bar{W}$ is contained in
$g^{-1}(U)$ and $\bar{W}$ is disjoint from E. Then we can choose a smooth function $\chi$ on
D such that $\chi$ equals 1 on W but such that $\chi$ equals 0 both on E and on the
complement of $g^{-1}(U)$. Let us define a family of maps $g_s : D \to N$ by setting

\[ g_s(x) = g(x) + s\chi(x)p, \quad 0 \le s \le 1, \]

for $x \in g^{-1}(U)$ and setting $g_s(x) = g(x)$ for $x \notin g^{-1}(U)$.
When $s = 1$ and $x \in W$, we have

\[ f(m) - g_1(x) = f(m) - g(x) - p = h(m, x) - p. \]

Since h never takes the value p, we see that f(m) does not equal $g_1(x)$; thus,
$g_1(W) \subset U$ is disjoint from f(M). We conclude that $g_1$ is a map homotopic to
g such that $g_1 = g$ on E and $g_1(W)$ is disjoint from f(M). Furthermore, since
p can be chosen to be as small as we want, we can make $g_1$ uniformly as close
to g as we like.


Step 2: The global argument. Choose a neighborhood V of E in D such that
$g(\bar{V})$ is disjoint from f(M), and let K be the complement of V in D. For each x
in K, we can find a neighborhood $U \subset N$ of g(x) such that U is diffeomorphic to
$\mathbb{R}^n$. Now choose a neighborhood W of x so that $\bar{W}$ is contained in $g^{-1}(U)$ but $\bar{W}$
is disjoint from E. Since K is compact, there is some finite collection $x_1, \ldots, x_N$
of points in K such that the associated open sets $W_1, \ldots, W_N$ cover K. Thus,
each $\bar{W}_j$ is disjoint from E and is contained in a set of the form $g^{-1}(U_j)$, with
$U_j \subset N$ diffeomorphic to $\mathbb{R}^n$.

By the argument in Step 1, we can find a map $g_1$ homotopic to g such that
$g_1 = g$ on E and $g_1(W_1)$ is disjoint from f(M), but such that $g_1$ is as close as
we like to g. Since $g(\bar{V})$ and f(M) are compact, the distance between $g(\bar{V})$ and
f(M) (with respect to an arbitrarily chosen Riemannian metric on N) achieves
a positive minimum. Therefore, if we take $g_1$ close enough to g, then $g_1(\bar{V})$
will still be disjoint from f(M). We can similarly ensure that $g_1$ still maps the
compact set $\bar{W}_j$ into $U_j$ for $j = 2, \ldots, N$.

We may now perform a similar perturbation of $g_1$ to a map $g_2$ such that
$g_2 = g_1 = g$ on E but such that $g_2(W_2)$ is disjoint from f(M). By making this
perturbation small enough, we can ensure that $g_2$ has the same properties as
$g_1$: first, $g_2(\bar{V})$ is still disjoint from f(M); second, $g_2(W_1)$ is still disjoint from
f(M); and third, $g_2(\bar{W}_j)$ is still contained in $U_j$ for $j = 3, \ldots, N$. Proceeding
in this fashion, we eventually obtain a map $g_N$ homotopic to g such that
$g_N = g$ on E and such that $g_N(\bar{V})$ and each $g_N(W_j)$ are disjoint from f(M).
Then $g' := g_N$ is the desired map. □


Proof of Theorem 13.27. In Lemma 13.28, it is harmless to assume that each $M_j$
has dimension $\dim K - 3$, since if any $M_j$ has a lower dimension, we can take a
product of $M_j$ with, say, a sphere of an appropriate dimension, and then
make $f_j$ independent of the sphere variable. Once this is done, we can let M be the
disjoint union of the $M_j$'s and let f be the map equal to $f_j$ on $M_j$. That is to say,
the singular set is actually in the image under a smooth map f of a single manifold
M of dimension $\dim K - 3$.

If $l$ is any loop in K, it is not hard to prove that $l$ is homotopic to a smooth loop.
(See, for example, Theorem 6.26 in [Lee].) In fact, this claim can be proved by an
argument similar to the proof of Lemma 13.29; locally, any continuous map from
$\mathbb{R}^n$ to $\mathbb{R}^m$ can be approximated by smooth functions (even polynomials), and one
can then patch together these approximations, being careful at each stage not to
disrupt the smoothness achieved at the previous stage. We may thus assume that $l$
is smooth and apply Lemma 13.29 with $D = S^1$, $N = K$, and $E = \emptyset$. Since
$\dim S^1 + \dim M < \dim K$, the lemma says that we can deform $l$ until it does not
intersect $f(M) \supset K_{\mathrm{sing}}$.

Next, suppose $l$ is a loop in $K_{\mathrm{reg}}$, which we can assume to be smooth. If $l$ is
null homotopic in $K_{\mathrm{reg}}$, it is certainly null homotopic in K. In the other direction,
suppose $l$ is null homotopic in K. We think of the homotopy of $l$ to a point as a map
g of a 2-disk D into K, where the boundary of D corresponds to the original loop
$l$ and the center of D corresponds to the point. After deforming g slightly, by the
same argument as in the previous paragraph, we may assume that g is smooth. We
now apply Lemma 13.29 with D equal to the 2-disk and E equal to the boundary
of D. The lemma tells us that we can deform g to a map $g'$ that agrees with g on
the boundary of D but such that $g'(D)$ is disjoint from $f(M) \supset K_{\mathrm{sing}}$. Thus, $g'$ is
a homotopy of $l$ to a point in $K_{\mathrm{reg}}$. □


13.6 The Stiefel Diagram 


In this section, we look more closely at the structure of the hyperplanes in $\mathfrak{t}$ where
$\langle \alpha, H \rangle = n$, which appear in Proposition 13.24. The main result of the section is
Theorem 13.35, which constructs a cover of the regular set $K_{\mathrm{reg}}$.


Definition 13.30. For each $n \in \mathbb{Z}$ and $\alpha \in R$, consider the hyperplane (not
necessarily through the origin) given by

\[ L_{\alpha,n} = \{ H \in \mathfrak{t} \mid \langle \alpha, H \rangle = n \}. \]

The union of all such hyperplanes is called the Stiefel diagram. A connected
component of the complement in $\mathfrak{t}$ of the Stiefel diagram is called an alcove.


In light of Proposition 13.24 and Definition 13.25, the complement of the Stiefel
diagram is just the regular set $\mathfrak{t}_{\mathrm{reg}}$ in $\mathfrak{t}$. Figures 13.3, 13.4, and 13.5 show the Stiefel
diagrams for $A_2$, $B_2$, and $G_2$, respectively. In each figure, the roots are indicated
with arrows, with the long roots being normalized to have length $\sqrt{2}$. The black
dots in the figure indicate the coroots and one alcove is shaded. Note that since
$\langle \alpha, H_\alpha \rangle = 2$, if we start at the origin and travel in the $\alpha$ direction, the coroot $H_\alpha$
will be located on the second hyperplane orthogonal to $\alpha$.

Fig. 13.3 The Stiefel diagram for $A_2$, with the roots normalized to have length $\sqrt{2}$. The
black dots indicate the coroots

Fig. 13.4 The Stiefel diagram for $B_2$, with the long roots normalized to have length $\sqrt{2}$. The
black dots indicate the coroots

Fig. 13.5 The Stiefel diagram for $G_2$, with the long roots normalized to have length $\sqrt{2}$. The
black dots indicate the coroots

Let us check the correctness of, say, the diagram for $G_2$. Suppose $\alpha$ is a long root,
which we normalize so that $\langle \alpha, \alpha \rangle = 2$. Then $\alpha/2$ belongs to the line $L_{\alpha,1}$, so that
$L_{\alpha,1}$ is the (unique) line orthogonal to $\alpha$ passing through $\alpha/2$. Similarly, $L_{\alpha,n}$ is the
line orthogonal to $\alpha$ passing through $n\alpha/2$. Suppose, on the other hand, that $\alpha$ is a
short root, which we must then normalize so that $\langle \alpha, \alpha \rangle = 2/3$. Then $3\alpha/2$ belongs
to $L_{\alpha,1}$, so that $L_{\alpha,1}$ is the line orthogonal to $\alpha$ passing through $3\alpha/2$ and $L_{\alpha,n}$ is the
line orthogonal to $\alpha$ passing through $3n\alpha/2$. Meanwhile, for a long root $\alpha$, we have
$H_\alpha = 2\alpha/2 = \alpha$, whereas for a short root, we have $H_\alpha = 2\alpha/(2/3) = 3\alpha$. These
results agree with what we see in Figure 13.5.
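
The arithmetic in the $G_2$ check can be reproduced numerically. The sketch below assumes explicit coordinates for one long root and one adjacent short root of $G_2$ (placed 30 degrees apart, with squared lengths 2 and 2/3); these coordinates and the function names are illustrative choices and do not come from the figures.

```python
import math

def inner(u, v):
    """Standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def coroot(alpha):
    """Real coroot H_alpha = 2 alpha / <alpha, alpha>."""
    c = 2.0 / inner(alpha, alpha)
    return (c * alpha[0], c * alpha[1])

def on_hyperplane(alpha, n, H, tol=1e-12):
    """True when H lies on L_{alpha,n} = {H : <alpha, H> = n}."""
    return abs(inner(alpha, H) - n) < tol

def scale(c, v):
    return (c * v[0], c * v[1])

# Assumed coordinates: a long root on the x-axis and a short root at 30 degrees.
long_root = (math.sqrt(2.0), 0.0)
r = math.sqrt(2.0 / 3.0)
short_root = (r * math.cos(math.pi / 6), r * math.sin(math.pi / 6))

for n in range(-2, 3):
    # n*alpha/2 lies on L_{alpha,n} for a long root, 3n*alpha/2 for a short root.
    print(n,
          on_hyperplane(long_root, n, scale(n / 2, long_root)),
          on_hyperplane(short_root, n, scale(3 * n / 2, short_root)))
```

With these coordinates, `coroot(long_root)` reproduces the long root itself and `coroot(short_root)` is three times the short root, matching the computation of $H_\alpha$ above.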

It will follow from Proposition 13.34 that every alcove is isometric to every other 
alcove. Furthermore, if $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is simple (i.e., if the root system R is irreducible),
each alcove is a “simplex,” that is, a bounded region in a k-dimensional space 
defined by k + 1 linear inequalities. Thus, for rank-two simple algebras, each alcove 
is a triangle, as can be seen from Figures 13.3, 13.4, and 13.5. In general, the 
structure of the alcoves for a simple algebra is described by an extended Dynkin 
diagram; see Exercises 11, 12, and 13. 
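
For orientation, here is a small worked example in the rank-one case. It is a sketch that assumes K = SU(2) with T the diagonal subgroup and the scaling of the exponential map by $2\pi$ used throughout; the parametrization $H_a$ is introduced only for this illustration.

```latex
% Worked example (illustrative): K = SU(2), T = diagonal subgroup.
\begin{align*}
\mathfrak{t} &= \{ H_a = \mathrm{diag}(ia, -ia) : a \in \mathbb{R} \}, &
e^{2\pi H_a} &= \mathrm{diag}(e^{2\pi i a}, e^{-2\pi i a}), \\
\Gamma &= \{ H_a : a \in \mathbb{Z} \}, &
\langle \alpha, H_a \rangle &= 2a, \quad H_\alpha = H_1 = \mathrm{diag}(i, -i), \\
I &= \mathbb{Z} H_\alpha = \Gamma, &
L_{\alpha,n} &= \{ H_a : a = n/2 \}.
\end{align*}
% Alcoves: the open intervals n/2 < a < (n+1)/2, each a one-dimensional simplex.
% Singular elements of T: e^{2\pi H_a} with a \in \tfrac{1}{2}\mathbb{Z}, that is, \pm I.
```

In this case the Stiefel diagram is the set of half-integer points on a line, the singular set is $\{I, -I\}$ as noted earlier, and the coroot lattice coincides with the kernel of the exponential map, as one expects for a simply connected group.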


Proposition 13.31. The Stiefel diagram is invariant under the action of W and
under translations by elements of $\Gamma$.


Proof. Since W permutes the roots, invariance of the Stiefel diagram under W is evident.
Meanwhile, if $\gamma$ is in the kernel $\Gamma$ of the exponential map, the adjoint action of $e^{2\pi\gamma}$
on $\mathfrak{g}$ is trivial. Thus, for $X \in \mathfrak{g}_\alpha$, we have

\[ X = \mathrm{Ad}_{e^{2\pi\gamma}}(X) = e^{2\pi i \langle \alpha, \gamma \rangle} X, \]

which means that $\langle \alpha, \gamma \rangle$ is an integer, for all $\alpha \in R$. From this we can easily see
that if H is in the Stiefel diagram, so is $H + \gamma$. □

Recall that an affine transformation of a vector space is a transformation that
can be expressed as a combination of a linear transformation and a translation.


Proposition 13.32. Let $\Gamma \rtimes W$ denote the set of affine transformations of $\mathfrak{t}$ that can
be expressed in the form

\[ H \mapsto w \cdot H + \gamma \]

for some $\gamma \in \Gamma$ and some $w \in W$. Then $\Gamma \rtimes W$ forms a group under composition,
with the group law given by

\[ (\gamma, w) \cdot (\gamma', w') = (\gamma + w \cdot \gamma', ww'). \]

The group $\Gamma \rtimes W$ is a semidirect product of $\Gamma$ and W, with $\Gamma$ being the normal
factor.


Proof. We merely compute that

\[ w \cdot (w' \cdot H + \gamma') + \gamma = (ww') \cdot H + \gamma + w \cdot \gamma', \]

so that the composition of the affine transformations associated to $(\gamma, w)$ and $(\gamma', w')$
is the affine transformation associated to $(\gamma + w \cdot \gamma', ww')$. □
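
The group law can be checked mechanically by composing affine maps. The following is a minimal sketch in which an element of the semidirect product is stored as a pair (gamma, w), with gamma a tuple and w a square matrix given as a tuple of rows; the sample matrices and vectors are arbitrary illustrative choices and are not tied to a particular root system.

```python
def apply(g, H):
    """Apply the affine map H -> w.H + gamma, where g = (gamma, w)."""
    gamma, w = g
    n = len(H)
    return tuple(sum(w[i][j] * H[j] for j in range(n)) + gamma[i] for i in range(n))

def compose(g, g2):
    """Group law (gamma, w).(gamma', w') = (gamma + w.gamma', w w')."""
    gamma, w = g
    gamma2, w2 = g2
    n = len(gamma)
    w_gamma2 = tuple(sum(w[i][j] * gamma2[j] for j in range(n)) for i in range(n))
    ww2 = tuple(tuple(sum(w[i][k] * w2[k][j] for k in range(n)) for j in range(n))
                for i in range(n))
    return (tuple(gamma[i] + w_gamma2[i] for i in range(n)), ww2)

g = ((1, 0), ((0, 1), (1, 0)))     # swap the two coordinates, then translate by (1, 0)
g2 = ((0, 2), ((-1, 0), (0, 1)))   # negate the first coordinate, then translate by (0, 2)
H = (3, 5)

# Applying g2 first and then g agrees with applying the single element compose(g, g2).
print(apply(g, apply(g2, H)), apply(compose(g, g2), H))
```

Storing the translation part first mirrors the notation $(\gamma, w)$ used in the proposition.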


We now introduce the analog of the Weyl group for the Stiefel diagram. Even
if a hyperplane $V \subset \mathfrak{t}$ does not pass through the origin, we can still speak of the
reflection s about V, which is the unique affine transformation of $\mathfrak{t}$ such that

\[ s(H + H') = H - H' \]

whenever H is in V and $H'$ is orthogonal to V.


Definition 13.33. The extended Weyl group for K relative to $\mathfrak{t}$ is the group of
affine transformations of $\mathfrak{t}$ generated by reflections about all the hyperplanes in the
Stiefel diagram of $\mathfrak{t}$.


We now establish some key properties of the extended Weyl group. Recall from
Definition 13.16 the notion of the coroot lattice $I$.


Proposition 13.34. 1. The extended Weyl group equals $I \rtimes W$, the semidirect
product of the ordinary Weyl group W and the coroot lattice $I$.
2. The extended Weyl group acts freely and transitively on the set of alcoves.


Note that since $I \subset \Gamma$ is invariant under the action of W, the extended Weyl
group $I \rtimes W$ is a subgroup of the group $\Gamma \rtimes W$ in Proposition 13.32. That is to say,
the group law in $I \rtimes W$ is the same as in Proposition 13.32, although $I \rtimes W$ will,
in general, be a proper subgroup of $\Gamma \rtimes W$.


Proof. For Point 1, let $V_\alpha$, $\alpha \in R$, be the hyperplane through the origin orthogonal
to $\alpha$. Since $\langle \alpha, H_\alpha \rangle = 2$, we see that the hyperplane $L_{\alpha,n}$ in Definition 13.30 is the
translate of $V_\alpha$ by $nH_\alpha/2$. We can then easily verify that the reflection $s_{\alpha,n}$ about
$L_{\alpha,n}$ is given by

\[ s_{\alpha,n}(H) = s_\alpha(H) + nH_\alpha, \qquad (13.9) \]

where $s_\alpha$ is the reflection about $V_\alpha$. Thus, $s_{\alpha,n}$ is a combination of an element $s_\alpha$ of
W and a translation by an element of $I$. It follows that the extended Weyl group
is contained in $I \rtimes W$. In the other direction, the extended Weyl group certainly
contains W, since the Stiefel diagram contains the hyperplanes through the origin
orthogonal to the roots. Furthermore, from (13.9), we see that the composition of
$s_{\alpha,1}$ and $s_{\alpha,0}$ is translation by $H_\alpha$. Thus, the extended Weyl group contains all of
$I \rtimes W$.

For Point 2, we first argue that the orbits of $I \rtimes W$ in $\mathfrak{t}$ do not have accumulation
points. For any $H \in \mathfrak{t}$, the orbit of H is the set of all vectors of the form $w \cdot H + \gamma$,
with $w \in W$ and $\gamma \in I$. Since W is finite and $I$ contains only finitely many points in
each bounded region, the orbit of H also contains only finitely many points in each
bounded region. Once this observation has been made, we may now repeat, almost
word for word, the proof that the ordinary Weyl group acts freely and transitively
on the set of open Weyl chambers. (Replace the hyperplanes orthogonal to the roots
with the hyperplanes $L_{\alpha,n}$ and replace the open Weyl chambers with the alcoves.)
The only point where a change is necessary is in the proof of the transitivity of the
action (Proposition 8.23). To generalize that argument, we need to know that for
each H and $H'$ in $\mathfrak{t}$, the orbit $(I \rtimes W) \cdot H'$ contains a point at minimal distance
from H. Although the extended Weyl group is infinite, since each orbit contains
only finitely many points in each bounded region, the result still holds. The reader is
invited to work through the proofs of Propositions 8.23 and 8.27 with the ordinary
Weyl group replaced by the extended Weyl group, and verify that no other changes
are needed. □
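
Formula (13.9) and the translation property used in Point 1 can be verified with exact arithmetic. The sketch below assumes the root $\alpha = (1, 1)$, which is a long root of $B_2$ in the normalization where long roots have length $\sqrt{2}$; the function names and test points are illustrative.

```python
from fractions import Fraction

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def coroot(alpha):
    """Real coroot H_alpha = 2 alpha / <alpha, alpha>."""
    c = Fraction(2) / inner(alpha, alpha)
    return tuple(c * a for a in alpha)

def reflect(alpha, n, H):
    """Affine reflection about L_{alpha,n}: s_{alpha,n}(H) = s_alpha(H) + n H_alpha,
    where s_alpha(H) = H - <alpha, H> H_alpha is the linear reflection."""
    Ha = coroot(alpha)
    c = inner(alpha, H)
    return tuple(h - c * ha + n * ha for h, ha in zip(H, Ha))

alpha = (1, 1)
H = (Fraction(1, 3), Fraction(5, 7))

# A point with <alpha, P> = 3 lies on L_{alpha,3} and is fixed by s_{alpha,3}.
fixed = reflect(alpha, 3, (1, 2))
# Composing s_{alpha,1} with s_{alpha,0} translates by the coroot H_alpha.
translated = reflect(alpha, 1, reflect(alpha, 0, H))
print(fixed, translated)
```

Using `Fraction` keeps every coordinate exact, so the comparisons below are equalities rather than tolerance checks.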


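Theorem 13.35. The map $\Psi : \mathfrak{t}_{\mathrm{reg}} \times (K/T) \to K_{\mathrm{reg}}$ is a covering map.
Furthermore, if $A \subset \mathfrak{t}$ is any one alcove, then $\Psi$ maps $A \times (K/T)$ onto $K_{\mathrm{reg}}$ and

\[ \Psi : A \times (K/T) \to K_{\mathrm{reg}} \]

is also a covering map.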


Recall the definition of $\Psi$ in (13.5). We will see in the next section that K/T
is simply connected. Since each alcove A is a convex set, A is contractible and
therefore simply connected. We will thus conclude that $A \times (K/T)$ is the universal
cover of $K_{\mathrm{reg}}$.


Proof. Recall the map $\Phi : T \times (K/T) \to K$ given by $\Phi(t, [x]) = xtx^{-1}$, and
let $T_{\mathrm{reg}} = K_{\mathrm{reg}} \cap T$. Since the exponential map for T is a local diffeomorphism, it
follows from Proposition 13.24 that $\Phi$ is a local diffeomorphism on $T_{\mathrm{reg}} \times (K/T)$.
Furthermore, by a trivial extension of Proposition 13.26, we have
$\Phi(t, [x]) = \Phi(t', [x'])$ if and only if there is some $z \in N(T)$ such that $t' = ztz^{-1}$ and
$[x'] = [xz^{-1}]$. Finally, if $W = N(T)/T$ acts on $T_{\mathrm{reg}} \times (K/T)$ by

\[ w \cdot (t, [x]) = (ztz^{-1}, [xz^{-1}]) \]

for each $w = [z]$ in W, then this action is free on K/T and thus free on $T_{\mathrm{reg}} \times (K/T)$.

Fix some y in $K_{\mathrm{reg}}$ and pick some $(t, [x])$ in $T_{\mathrm{reg}} \times (K/T)$ for which
$\Phi(t, [x]) = y$. Then since W is finite and acts freely, we can easily find a
neighborhood U of $(t, [x])$ such that $w \cdot U$ is disjoint from U for all $w \neq e$ in
W. It follows that the sets $w \cdot U$ are pairwise disjoint and thus that $\Phi$ is injective
on each such set. Now, since $\Phi$ is a local diffeomorphism, the set $V := \Phi(U)$ will be
open in $K_{\mathrm{reg}}$. The preimage of V is then the disjoint union of the sets $w \cdot U$, and the local
diffeomorphism $\Phi$ will map each $w \cdot U$ homeomorphically onto V. We conclude
that $\Phi$ is a covering map of $T_{\mathrm{reg}} \times (K/T)$ onto $K_{\mathrm{reg}}$.

Meanwhile, as shown in the proof of Proposition 13.14, the exponential map for
T is a covering map. Thus, the map

\[ (H, [x]) \mapsto (e^{2\pi H}, [x]) \]

is a covering map from $\mathfrak{t}_{\mathrm{reg}} \times (K/T)$ onto $T_{\mathrm{reg}} \times (K/T)$. Since the composition of
covering maps is a covering map, we conclude that $\Psi : \mathfrak{t}_{\mathrm{reg}} \times (K/T) \to K_{\mathrm{reg}}$ is a
covering map, as claimed.

Finally, Proposition 13.34 shows that each point in $\mathfrak{t}_{\mathrm{reg}}$ can be moved into A by
the action of $I \rtimes W \subset \Gamma \rtimes W$. Thus, $\Psi$ actually maps $A \times (K/T)$ onto $K_{\mathrm{reg}}$. For each
$y \in K_{\mathrm{reg}}$, choose a neighborhood V of y so that $\Psi^{-1}(V)$ is a disjoint union of open
sets $U_\alpha$ mapping homeomorphically onto V. By shrinking V if necessary, we can
assume V is connected, in which case the $U_\alpha$'s will also be connected. Thus, each
$U_\alpha$ is either entirely in $A \times (K/T)$ or disjoint from $A \times (K/T)$. Thus, if we restrict
$\Psi$ to $A \times (K/T)$, the preimage of V will now consist of some subset of the $U_\alpha$'s,
each of which still maps homeomorphically onto V, showing that the restriction of
$\Psi$ to $A \times (K/T)$ is still a covering map. □


13.7 Proofs of the Main Theorems 


We now have all the necessary tools to attack the proofs of our main results. Here is
an outline of our strategy in this section. We will first show that the quotient K/T is
always simply connected. On the one hand, the simple connectivity of K/T leads
to a proof of Theorem 13.15, that every loop in K is homotopic to a loop in T. On
the other hand, the simple connectivity of K/T means that the set $A \times (K/T)$ in
Theorem 13.35 is actually the universal cover of $K_{\mathrm{reg}}$. Thus, to determine
$\pi_1(K) = \pi_1(K_{\mathrm{reg}})$, we merely need to determine how close to being injective the covering
map $\Psi : A \times (K/T) \to K_{\mathrm{reg}}$ is.

Now, according to Proposition 13.26, the failure of injectivity for the full map
$\Psi$ on $\mathfrak{t}_{\mathrm{reg}} \times (K/T)$ is due to the action of the group $\Gamma \rtimes W$. If we restrict $\Psi$ to
$A \times (K/T)$, then by Proposition 13.34, we have eliminated the failure of injectivity
due to the subgroup $I \rtimes W \subset \Gamma \rtimes W$. Thus, the (possible) failure of injectivity of
$\Psi$ on $A \times (K/T)$ will be measured by the extent to which $I$ fails to be all of $\Gamma$.


Proposition 13.36. The quotient manifold K/T is simply connected. 


Proof. Let $[x(\tau)]$ be any loop in K/T. Let H be a regular element in $\mathfrak{t}$ and consider
the loop $l$ in $K_{\mathrm{reg}}$ given by

\[ l(\tau) = x(\tau) e^{2\pi H} x(\tau)^{-1}. \]

Now let $t(s)$ be any path in T connecting $e^{2\pi H}$ to $I$, and consider the loops

\[ l_s(\tau) := x(\tau) t(s) x(\tau)^{-1}. \]

Clearly, $l_s$ is a homotopy of $l$ in K to the constant loop at $I$.

Thus, for any loop $[x(\tau)]$ in K/T, the corresponding loop $l(\tau)$ in $K_{\mathrm{reg}}$ is null
homotopic in K. We now argue that $[x(\tau)]$ itself is null homotopic in K/T. As a first
step, we use Theorem 13.27 to deform the homotopy $l_s$ to a homotopy $l'_s$ shrinking
$l$ to a point in $K_{\mathrm{reg}}$. Now, the map $\Psi$ is a covering map and the loop $(H, [x(\tau)])$
is a lift of $l$ to $A \times (K/T)$, where A is the alcove containing H. Thus, as a second
step, we can lift the homotopy $l'_s$ to $A \times (K/T)$ to a homotopy $l''_s$ shrinking
$(H, [x(\tau)])$ to a point in $A \times (K/T)$. (See Proposition 13.3.) Finally, as a third step,
we can project the homotopy $l''_s$ from $A \times (K/T)$ onto K/T to obtain a homotopy
shrinking $[x(\tau)]$ to a point in K/T. □


Proposition 13.37. Every loop in K is homotopic to a loop in T. 


Proof. By Proposition 13.8, the group K is a fiber bundle with base K/T and fiber
T. Suppose now that $l$ is a loop in K and that $l'(\tau) := [l(\tau)]$ is the corresponding
loop in K/T. Since K/T is simply connected, there is a homotopy $l'_s$ shrinking $l'$
to a point in K/T. Furthermore, since K/T is connected, it is harmless to assume
that $l'_s$ shrinks to the point $[I]$ in K/T. Now, fiber bundles are known to have the
homotopy lifting property (Proposition 4.48 in [Hat]), which in our case means that
there is a homotopy $l_s$ in K such that $l_0 = l$ and such that $[l_s(\tau)] = l'_s(\tau)$ for all $\tau$
and s. Since $l'_1$ is a constant loop at $[I]$, the loop $l_1$ lies in T. □


Lemma 13.38. Suppose $\gamma \in \Gamma$ but $\gamma \notin I$. Then there exist $\gamma'$ in $I$ and w in W such
that the affine transformation

\[ H \mapsto w \cdot (H + \gamma + \gamma') \qquad (13.10) \]

maps A to itself but is not the identity map of A.


Proof. By Proposition 13.31, translation by $\gamma$ maps the alcove A to some other
alcove $A'$. Then by Proposition 13.34, there exists an element of the extended Weyl
group that maps $A'$ back to A. Thus, there exist $\gamma' \in I$ and $w \in W$ such that
the map (13.10) maps A to itself. If this map were the identity, then translation by
$\gamma$ would be (the inverse of) some element of $I \rtimes W$, which would mean that $\gamma$
would have to be in $I$, contrary to our assumption. Thus, the map in (13.10) sends
A to itself and is not the identity map on $\mathfrak{t}$. But an affine transformation is certainly
determined by its restriction to any nonempty open set, which means that the map
cannot be the identity on A. □

Fig. 13.6 The element $\gamma$ is in the kernel of the exponential mapping for PSU(3), but is not in the
coroot lattice. If we apply a rotation by $2\pi/3$ to $H + \gamma$, we obtain an element $H'$ in the same
alcove as H


Suppose now that $\gamma \in \Gamma$ but $\gamma \notin I$. Let us choose $\gamma' \in I$ and $w \in W$ as in
Lemma 13.38. We may then choose $H \in A$ so that $H' := w \cdot (H + \gamma + \gamma')$ lies in
A but is distinct from H. (See Figure 13.6 in the case of the group PSU(3).) Let p
denote the path in A given by

\[ p(\tau) = H + \tau(H' - H), \quad 0 \le \tau \le 1. \]

Now choose $x \in N(T)$ representing w, let $x(\cdot)$ be a path connecting $I$ to x in K,
and define a path q in $A \times (K/T)$ by

\[ q(\tau) = (p(\tau), [x(\tau)^{-1}]). \qquad (13.11) \]

Note that $p(1) \neq p(0)$ and thus, certainly, $q(1) \neq q(0)$.


Lemma 13.39. Let $q(\tau)$ be the path in $A \times (K/T)$ given by (13.11). Then the path

\[ \tau \mapsto \Psi(q(\tau)) \]

is a loop in $K_{\mathrm{reg}}$ and this loop is homotopic in K to the loop

\[ \tau \mapsto e^{2\pi\tau\gamma}. \]

Proof. On the one hand, we have

\[ \Psi(q(0)) = e^{2\pi H}. \]

On the other hand, since $x^{-1}$ represents $w^{-1}$, we have

\[ \Psi(q(1)) = x^{-1} e^{2\pi w \cdot (H + \gamma + \gamma')} x = e^{2\pi(H + \gamma + \gamma')} = e^{2\pi H}, \]

since $\gamma \in \Gamma$ and $\gamma' \in I \subset \Gamma$. Thus, $\Psi \circ q$ is a loop in $K_{\mathrm{reg}}$, as claimed.

Meanwhile,

\[ \Psi(q(\tau)) = \exp\{2\pi\, x(\tau)^{-1} p(\tau) x(\tau)\}, \]

where $x(0)^{-1} p(0) x(0) = H$ and $x(1)^{-1} p(1) x(1) = H + \gamma + \gamma'$. Since the vector
space $\mathfrak{k}$ is simply connected, the path

\[ \tau \mapsto x(\tau)^{-1} p(\tau) x(\tau) \]

in $\mathfrak{k}$ is homotopic with endpoints fixed to the straight-line path connecting H to
$H + \gamma + \gamma'$, namely

\[ \tau \mapsto H + \tau(\gamma + \gamma'). \]

Since the exponential map for K is continuous, we see that $\Psi \circ q$ is homotopic in
K to the loop

\[ \tau \mapsto e^{2\pi H} e^{2\pi\tau(\gamma + \gamma')}. \]

We may then continuously deform $e^{2\pi H}$ to the identity, showing that $\Psi \circ q$ is
homotopic in K to the loop $\tau \mapsto e^{2\pi\tau(\gamma + \gamma')}$.

Finally, in light of Proposition 13.14, the loop $\tau \mapsto e^{2\pi\tau(\gamma + \gamma')}$ is homotopic to
the composition (in either order) of the loop $\tau \mapsto e^{2\pi\tau\gamma}$ and the loop $\tau \mapsto e^{2\pi\tau\gamma'}$.
But since $\gamma'$ belongs to $I$, we have already shown in Sect. 13.4 that this second loop
is null homotopic in K, showing that $\Psi \circ q$ is homotopic in K to $\tau \mapsto e^{2\pi\tau\gamma}$, as
claimed. □


It now remains only to assemble the previous results to finish the proof of Theorem 13.17.

Proof of Theorem 13.17, the other direction. We showed in Sect. 13.4 that if $\gamma \in I$, the loop $t \mapsto e^{2\pi t\gamma}$ is null homotopic in $K$. In the other direction, suppose that $\gamma$ belongs to $\Gamma$ but not to $I$. We may then construct the path $q$ in $A \times (K/T)$ in (13.11). Although this path is not a loop in $A \times (K/T)$, the path $\Psi \circ q$ is a loop in $K_{\mathrm{reg}}$, by Lemma 13.39. Since $q$ is a lift of $\Psi \circ q$ and $q$ has distinct endpoints, Corollary 13.4 tells us that $\Psi \circ q$ is not null homotopic in $K_{\mathrm{reg}}$. Meanwhile, since $\Psi \circ q$ is homotopic in $K$ to $t \mapsto e^{2\pi t\gamma}$ and $\pi_1(K_{\mathrm{reg}})$ is isomorphic to $\pi_1(K)$ (Theorem 13.27), we conclude that $t \mapsto e^{2\pi t\gamma}$ is not null homotopic in $K$. □


Summary. It is instructive to summarize the arguments in the proofs of 
Theorems 13.15 and 13.17, as described in this section and the previous three 
sections. We introduced the regular set and the singular set in $K$ and we showed that the singular set has codimension at least 3. Using this, we proved a key result, that $\pi_1(K_{\mathrm{reg}})$ is isomorphic to $\pi_1(K)$. Next, we introduced the Stiefel diagram and the local diffeomorphism

$$\Psi : \mathfrak{t}_{\mathrm{reg}} \times (K/T) \to K_{\mathrm{reg}}.$$

We found that the failure of injectivity of $\Psi$ on $\mathfrak{t}_{\mathrm{reg}} \times (K/T)$ is measured by the action of the group $\Gamma \rtimes W$, and that the subgroup $I \rtimes W$ of $\Gamma \rtimes W$ acts transitively on the alcoves. We concluded that if $A$ is any alcove, the map $\Psi : A \times (K/T) \to K_{\mathrm{reg}}$ is a covering map.

We then demonstrated that $K/T$ is simply connected. We did this by mapping any loop $[x(\cdot)]$ in $K/T$ to a loop $l$ in $K$ and constructing a homotopy $l_s$ of $l$ to a point in $K$. Since $\pi_1(K_{\mathrm{reg}}) \cong \pi_1(K)$, we can deform the homotopy $l_s$ into $K_{\mathrm{reg}}$. We can then use the covering map $\Psi$ to lift the homotopy $l_s$ to $A \times (K/T)$ and project back onto $K/T$ to obtain a homotopy of $[x(\cdot)]$ to a point in $K/T$. After establishing that $K/T$ is simply connected, we proved our first main result, that every loop in $K$ is homotopic to a loop in $T$, as follows. We start with a loop $l$ in $K$, push it down to $K/T$ and then homotope it to the identity coset. We then lift this homotopy to $K$, giving a homotopy of $l$ into $T$.

Since $\pi_1(T)$ is easily calculated, we conclude that every loop in $K$ is homotopic to a loop in $T$ of the form $t \mapsto e^{2\pi t\gamma}$, for some $\gamma \in \Gamma$. When $\gamma \in I$, we showed that $t \mapsto e^{2\pi t\gamma}$ is the image under a continuous homomorphism of a loop in the simply connected group SU(2), which shows that $t \mapsto e^{2\pi t\gamma}$ is null homotopic. When $\gamma \notin I$, we started with some $H$ in $A$, translated by $\gamma$, and then mapped back to some $H' \in A$ by the action of $I \rtimes W$, where $H$ is chosen so that $H' \neq H$. We then constructed a path $q$ in $A \times (K/T)$ with distinct endpoints such that $\Psi \circ q$ is a loop in $K_{\mathrm{reg}}$ and is homotopic in $K$ to $t \mapsto e^{2\pi t\gamma}$. Since $\Psi : A \times (K/T) \to K_{\mathrm{reg}}$ is a covering map and $q$ has distinct endpoints, $\Psi \circ q$ is homotopically nontrivial in $K_{\mathrm{reg}}$. Then since $\pi_1(K_{\mathrm{reg}}) \cong \pi_1(K)$, we concluded that $t \mapsto e^{2\pi t\gamma}$ is homotopically nontrivial in $K$.


13.8 The Center of K


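In this section, we analyze the center of $K$ using the tools developed in the previous sections. If $T$ is any one fixed maximal torus in $K$, Corollary 11.11 tells us that $T$ contains the center $Z(K)$ of $K$. We now give a criterion for an element $t = e^{2\pi H}$ of $T$ to be in $Z(K)$.

Proposition 13.40. If $H \in \mathfrak{t}$, then $e^{2\pi H}$ belongs to $Z(K)$ if and only if

$$\langle \alpha, H \rangle \in \mathbb{Z}$$

for all $\alpha \in R$.

Proof. By Exercise 17 in Chapter 3, an element $x$ of $K$ is in $Z(K)$ if and only if $\mathrm{Ad}_x(X) = X$ for all $X \in \mathfrak{k}$, or, equivalently, if and only if $\mathrm{Ad}_x(X) = X$ for all $X \in \mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$. Now, if $X \in \mathfrak{g}_\alpha$, then

$$\mathrm{Ad}_{e^{2\pi H}}(X) = e^{2\pi\,\mathrm{ad}_H}(X) = e^{2\pi i \langle \alpha, H \rangle} X.$$

Thus, $\mathrm{Ad}_{e^{2\pi H}}$ acts as the identity on $\mathfrak{g}_\alpha$ if and only if $\langle \alpha, H \rangle$ is an integer. Since $\mathfrak{g}$ is the direct sum of $\mathfrak{t}_{\mathbb{C}}$ (on which $\mathrm{Ad}_{e^{2\pi H}}$ certainly acts trivially) and the $\mathfrak{g}_\alpha$'s, we see that $e^{2\pi H}$ belongs to $Z(K)$ if and only if $\langle \alpha, H \rangle \in \mathbb{Z}$ for all $\alpha$. □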
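
As a quick check on this criterion, one can work it out by hand for $K = \mathrm{SU}(2)$. The sketch below assumes the usual diagonal maximal torus and the convention $[H, X] = i\langle \alpha, H \rangle X$ for $X \in \mathfrak{g}_\alpha$, which is the convention implicit in the proof above; the parameter $\theta$ and the matrix unit $E_{12}$ are notation introduced here only for illustration.

```latex
H = \begin{pmatrix} i\theta & 0 \\ 0 & -i\theta \end{pmatrix} \in \mathfrak{t},
\qquad
E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
[H, E_{12}] = 2i\theta\, E_{12}
\;\Longrightarrow\;
\langle \alpha, H \rangle = 2\theta .

e^{2\pi H} = \begin{pmatrix} e^{2\pi i\theta} & 0 \\ 0 & e^{-2\pi i\theta} \end{pmatrix}
= \pm I
\iff 2\theta \in \mathbb{Z}
\iff \langle \alpha, H \rangle \in \mathbb{Z}.
```

Under these assumptions the criterion picks out exactly the elements $\pm I$, in agreement with the familiar fact that the center of SU(2) is $\{I, -I\}$.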


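Definition 13.41. Let $\Lambda \subset \mathfrak{t}$ denote the root lattice, that is, the set of all integer linear combinations of roots. Let $\Lambda^*$ denote the dual of the root lattice, that is,

$$\Lambda^* = \{\gamma \in \mathfrak{t} \mid \langle \lambda, \gamma \rangle \in \mathbb{Z},\ \forall \lambda \in \Lambda\}.$$

Note that if $\gamma \in \mathfrak{t}$ has the property that $\langle \alpha, \gamma \rangle \in \mathbb{Z}$ for every root $\alpha$, then certainly $\langle \lambda, \gamma \rangle \in \mathbb{Z}$ whenever $\lambda$ is an integer combination of roots. Thus, Proposition 13.40 may be restated as saying that $e^{2\pi H} \in Z(K)$ if and only if $H \in \Lambda^*$. Note also that if $e^{2\pi H} = I$, then certainly $e^{2\pi H}$ is in the center of $K$. Thus, the kernel $\Gamma$ of the exponential map must be contained in $\Lambda^*$.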


Proposition 13.42. The map

$$\gamma \mapsto e^{2\pi\gamma}, \quad \gamma \in \Lambda^*,$$

is a homomorphism of $\Lambda^*$ onto $Z(K)$ with kernel equal to $\Gamma$. Thus,

$$Z(K) \cong \Lambda^*/\Gamma,$$

where $\Lambda^*$ is the dual of the root lattice and $\Gamma$ is the kernel of the exponential.

Proof. As we have noted, Corollary 11.11 implies that $Z(K) \subset T$. Since the exponential map for $T$ is surjective, Proposition 13.40 tells us that the map $\gamma \mapsto e^{2\pi\gamma}$ maps $\Lambda^*$ onto $Z(K)$. This map is a homomorphism since $\mathfrak{t}$ is commutative, and the kernel of the map is $\Gamma \subset \Lambda^*$. □


Suppose, for example, that $K = T$. Then there are no roots, in which case the dual of the root lattice is all of $\mathfrak{t}$. In this case, we have

$$Z(K) = Z(T) \cong \mathfrak{t}/\Gamma \cong T.$$

On the other hand, if $\mathfrak{g}$ is semisimple, both $\Lambda^*$ and $\Gamma$ will be discrete subgroups of $\mathfrak{t}$ that span $\mathfrak{t}$, in which case, $\Lambda^*/\Gamma$ will be finite.

Note that we have several different lattices inside t. Some of these “really” live 
in t* and only become subsets of t when we use the inner product to identify t* 
with t. Other lattices naturally live in t itself. The lattices that really live in t* 
are the root lattice, the lattice of analytically integral elements, and the lattice of 
algebraically integral elements. Meanwhile, the lattices that naturally live in t are 
the coroot lattice, the kernel of the exponential map, and the dual of the root lattice. 
Note that there is a duality relationship between the lattices in t and the lattices in t*. 
An element is algebraically integral if and only if its inner product with each coroot 
is an integer; thus, the lattice of algebraically integral elements and the coroot lattice 
are dual to each other. Similarly, the lattice of analytically integral elements is dual 
to the kernel of the exponential map. Finally, A* is, by definition, dual to the root 
lattice A. 

The lattices in $\mathfrak{t}^*$ are included in one another as follows:

$$(\text{root lattice}) \subset (\text{analytically integral elements}) \subset (\text{algebraically integral elements}). \tag{13.12}$$

The dual lattices in $\mathfrak{t}$ are then included in one another in the reverse order:

$$(\text{coroot lattice}) \subset (\text{kernel of exponential}) \subset (\text{dual of root lattice}). \tag{13.13}$$


In light of Proposition 13.42 and Corollary 13.18, we have the following isomorphisms involving quotients of lattices in (13.13):

$$(\text{kernel of exponential})/(\text{coroot lattice}) \cong \pi_1(K)$$

and

$$(\text{dual of root lattice})/(\text{kernel of exponential}) \cong Z(K).$$

Corollary 13.43. Let $\Lambda^*$ denote the dual of the root lattice and let $I$ denote the coroot lattice. If $K$ is simply connected, then

$$Z(K) \cong \Lambda^*/I.$$

On the other hand, if $Z(K)$ is trivial, then

$$\pi_1(K) \cong \Lambda^*/I.$$


Let us define the adjoint group associated to $K$ to be the image of $K$ under the adjoint representation $\mathrm{Ad} : K \to \mathrm{GL}(\mathfrak{k})$. (Since $K$ is compact, the adjoint group of $K$ is compact and thus closed.) If $K$ is semisimple, then the center of $\mathfrak{k}$ is trivial, which means that the Lie algebra version of the adjoint representation, $\mathrm{ad} : \mathfrak{k} \to \mathfrak{gl}(\mathfrak{k})$, is faithful. Thus, if $\mathfrak{g}$ is semisimple, the Lie algebra of the adjoint group is isomorphic to the Lie algebra $\mathfrak{k}$ of $K$ itself. On the other hand, it is not hard to check (still assuming that $\mathfrak{g}$ is semisimple and thus that the Lie algebra of the adjoint group is isomorphic to $\mathfrak{k}$) that the center of the adjoint group is trivial. Thus, whenever $\mathfrak{g}$ is semisimple, we can construct a new group $K'$ where the second part of the corollary applies:

$$\pi_1(K') \cong \Lambda^*/I.$$

Proof. If $\pi_1(K)$ is trivial, the kernel $\Gamma$ of the exponential map must equal the coroot lattice $I$, which means that

$$Z(K) \cong \Lambda^*/\Gamma = \Lambda^*/I.$$

Meanwhile, if $Z(K)$ is trivial, then the kernel $\Gamma$ of the exponential must equal the dual $\Lambda^*$ of the root lattice, which means that

$$\pi_1(K) \cong \Gamma/I = \Lambda^*/I,$$

as claimed. □
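
When the three lattices in (13.13) all have full rank, the orders of the quotients $\Gamma/I$ and $\Lambda^*/\Gamma$ can be read off from determinants of basis matrices, since the index of a sublattice is the ratio of the two covolumes. The Python sketch below illustrates this in rank 2. The function names and the three sample bases are chosen here for illustration only, and the code assumes, without checking, that the second lattice is contained in the first.

```python
from fractions import Fraction


def det2(basis):
    """Determinant of a 2x2 matrix whose rows are lattice basis vectors."""
    (a, b), (c, d) = basis
    return Fraction(a) * Fraction(d) - Fraction(b) * Fraction(c)


def lattice_index(big, small):
    """Order of the quotient big/small for full-rank lattices in R^2.

    Assumes (does not verify) that `small` is a sublattice of `big`.
    The index is the ratio of covolumes |det(small)| / |det(big)|.
    """
    return abs(det2(small)) / abs(det2(big))


half = Fraction(1, 2)
even_sum = [(1, 1), (1, -1)]                # pairs (m, n) with m + n even
integers = [(1, 0), (0, 1)]                 # all of Z^2
half_shift = [(half, half), (half, -half)]  # x + y and x - y both integers

print(lattice_index(integers, even_sum))    # order of Z^2 / even_sum
print(lattice_index(half_shift, integers))  # order of half_shift / Z^2
```

Both quotients in this sample have order 2. The determinant only gives the order of the quotient, not its group structure.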


Example 13.44. If $K = \mathrm{SO}(4)$, then the lattices $\Lambda^*$, $\Gamma$, and $I$ are as in Figure 13.7 and both $\pi_1(K)$ and $Z(K)$ are isomorphic to $\mathbb{Z}/2$. Explicitly, $Z(\mathrm{SO}(4)) = \{I, -I\}$.


Proof. If we compute as in Sect. 7.7.2, but adjusting for a factor of $i$ to obtain the real roots and coroots, we find that the coroots are the matrices

$$\begin{pmatrix} 0 & a & & \\ -a & 0 & & \\ & & 0 & b \\ & & -b & 0 \end{pmatrix}, \tag{13.14}$$

where $a = \pm 1$ and $b = \pm 1$. We identify the coroots with the vectors $(a, b) = (\pm 1, \pm 1)$ in $\mathbb{R}^2$. The coroot lattice $I$ will then consist of pairs $(m, n) \in \mathbb{Z}^2$ for which $m + n$ is even. The kernel $\Gamma$ of the exponential, meanwhile, is easily seen to consist of all pairs $(m, n) \in \mathbb{Z}^2$. Finally, since the coroots have been normalized to have length $\sqrt{2}$, the roots and coroots coincide. Thus, the dual $\Lambda^*$ of the root lattice is the set of vectors $(x, y)$ having integer inner product with $(1, 1)$ and $(1, -1)$, that is, such that $x + y$ and $x - y$ are integers. Thus, either $x$ and $y$ are both integers, or $x$ and $y$ are both integer-plus-one-half. It is then easy to check that both $\Gamma/I$ and $\Lambda^*/\Gamma$ are isomorphic to $\mathbb{Z}/2$.

If $H$ is any coroot, then $H/2$ belongs to $\Lambda^*$ but not to $\Gamma$. Thus, the unique nontrivial element of $Z(\mathrm{SO}(4))$ may be computed as $e^{2\pi(H/2)}$. Direct calculation with (13.14) then shows that this unique nontrivial element is $-I \in \mathrm{SO}(4)$. □

Fig. 13.7 The lattices for SO(4), with the coroots indicated by arrows
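
The direct calculation mentioned at the end of the proof can also be checked numerically. The following Python sketch exponentiates block-diagonal matrices of the form (13.14) using the closed form for the exponential of a multiple of $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, which is a rotation matrix. The helper names are hypothetical, and the comparison uses a floating-point tolerance rather than exact arithmetic.

```python
import math


def rotation_block(angle):
    """exp(angle * J) for J = [[0, 1], [-1, 0]], using J^2 = -I."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]


def exp_2pi_block_diag(a, b):
    """exp(2*pi*H) for H block diagonal with 2x2 blocks a*J and b*J."""
    top = rotation_block(2 * math.pi * a)
    bottom = rotation_block(2 * math.pi * b)
    m = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            m[i][j] = top[i][j]
            m[i + 2][j + 2] = bottom[i][j]
    return m


def is_close_to(m, target, tol=1e-12):
    """Entrywise comparison of two 4x4 matrices up to a tolerance."""
    return all(abs(m[i][j] - target[i][j]) < tol
               for i in range(4) for j in range(4))


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
minus_identity = [[-x for x in row] for row in identity]

# The coroot (a, b) = (1, 1) lies in the kernel of the exponential.
print(is_close_to(exp_2pi_block_diag(1, 1), identity))
# Half of that coroot exponentiates to -I, the nontrivial central element.
print(is_close_to(exp_2pi_block_diag(0.5, 0.5), minus_identity))
```

Both checks print True, consistent with $H/2 \in \Lambda^*$ but $H/2 \notin \Gamma$.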


Example 13.45. Suppose that $K$ is a connected, compact matrix Lie group with Lie algebra $\mathfrak{k}$, that $\mathfrak{t}$ is a maximal commutative subalgebra of $\mathfrak{k}$, and that the root system of $\mathfrak{k}$ relative to $\mathfrak{t}$ is isomorphic to $G_2$. Then both $\pi_1(K)$ and $Z(K)$ are trivial.

It turns out that for compact groups with root systems of type $A_n$, $B_n$, $C_n$, and $D_n$, none of them is simultaneously simply connected and center free. Among the exceptional groups, however, the groups with root systems $G_2$, $F_4$, and $E_8$ all have this property.
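
One way to see which root systems can have this property uses a standard fact that is not proved in this section: the index of the root lattice in the lattice of algebraically integral elements equals the determinant of the Cartan matrix, and by duality the quotient $\Lambda^*/I$ has the same order. A group can therefore be both simply connected and center free only when this determinant is 1. The Python sketch below computes the determinant for a few Cartan matrices. The matrices are entered here in one common convention and are not taken from the text; the transposed convention gives the same determinants.

```python
def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# Cartan matrices in one common convention (an assumption of this sketch).
cartan = {
    "A2": [[2, -1], [-1, 2]],
    "B2": [[2, -2], [-1, 2]],
    "G2": [[2, -1], [-3, 2]],
    "F4": [[2, -1, 0, 0], [-1, 2, -2, 0], [0, -1, 2, -1], [0, 0, -1, 2]],
}

for name, matrix in cartan.items():
    print(name, det(matrix))
```

The determinants are 3 for $A_2$, 2 for $B_2$, and 1 for both $G_2$ and $F_4$.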


Proof. As we can see from Figure 8.11, each of the fundamental weights for $G_2$ is a root. Thus, every algebraically integral element for $G_2$ (i.e., every integer linear combination of the fundamental weights) is in the root lattice. Thus, all three of the lattices in (13.12) must be equal. By dualizing this observation, we see that all three of the lattices in (13.13) must also be equal. Thus, both $\pi_1(K) \cong \Gamma/I$ and $Z(K) \cong \Lambda^*/\Gamma$ are trivial. □


13.9 Exercises


In Exercises 1–4, we consider the maximal commutative subalgebra $\mathfrak{t}$ of the relevant Lie algebra $\mathfrak{k}$ given in Sects. 7.7.1–7.7.4. In each case, we identify the Cartan subalgebra $\mathfrak{h} = \mathfrak{t}_{\mathbb{C}}$ with $\mathbb{C}^n$ by the map given in those sections, except that we adjust the map by a factor of $i$, so that $\mathfrak{t}$ maps to $\mathbb{R}^n$. We also consider the real roots and coroots, which differ by a factor of $i$ from the roots and coroots in Chapter 7.


1. For the group SU(n), $n \geq 1$, show that the coroot lattice $I$ consists of all integer $n$-tuples $(k_1, \ldots, k_n)$ for which $k_1 + \cdots + k_n = 0$. Show that the kernel $\Gamma$ of the exponential is the same as the coroot lattice.

2. For the group SO(2n), $n \geq 2$, show that the coroot lattice $I$ is the set of integer linear combinations of vectors of the form $\pm e_j \pm e_k$, with $j \neq k$. Conclude that the coroot lattice consists of all integer $n$-tuples $(k_1, k_2, \ldots, k_n)$ for which $k_1 + \cdots + k_n$ is even. Show that the kernel $\Gamma$ of the exponential consists of all integer $n$-tuples and that $\Gamma/I \cong \mathbb{Z}/2$.

3. For the group SO(2n + 1), $n \geq 1$, show that the coroot lattice $I$ is the set of integer linear combinations of vectors of the form $\pm 2e_j$ and $\pm e_j \pm e_k$, with $j \neq k$. Conclude that, as for SO(2n), the coroot lattice consists of all integer $n$-tuples $(k_1, k_2, \ldots, k_n)$ for which $k_1 + \cdots + k_n$ is even and the kernel $\Gamma$ of the exponential consists of all integer $n$-tuples.

4. For the group Sp(n), $n \geq 1$, show that the coroot lattice $I$ is the set of integer linear combinations of vectors of the form $\pm e_j$ and $\pm e_j \pm e_k$, with $j \neq k$. Conclude that both the coroot lattice and the kernel $\Gamma$ of the exponential consist of all integer $n$-tuples.

5. Verify the claims in Example 13.19. 

Hints: In the case of SO(5), make use of the calculations in the proof of 
Example 12.13. In the case of PSU(3), note that X € psu(3) = su(3) 
exponentiates to the identity in PSU(3) if and only if X exponentiates in SU(3) 
to a constant multiple of the identity. 

6. Using Theorems 13.15 and 13.17 and the results of Exercises 2 and 3, show that every homotopically nontrivial loop in SO(n), $n \geq 3$, is homotopic to the loop

$$\theta \mapsto \begin{pmatrix} \cos\theta & -\sin\theta & \\ \sin\theta & \cos\theta & \\ & & I \end{pmatrix}.$$

(This result also follows from the inductive argument using fiber bundles, as discussed following the proof of Proposition 13.10.)

7. Let $G$ be a connected matrix Lie group. Using the following outline, show that $\pi_1(G)$ is commutative. Let $A(\cdot)$ and $B(\cdot)$ be any two loops in $G$ based at the identity. Construct two families of loops $\alpha_s(\cdot)$ (defined in terms of $A(\cdot)$) and $\beta_s(\cdot)$ (defined in terms of $B(\cdot)$) with the property that $\alpha_0(t)\beta_0(t)$ is the concatenation of $A$ with $B$ and $\alpha_1(t)\beta_1(t)$ is the concatenation of $B$ with $A$. (Here the product of, say, $\alpha_0(t)\beta_0(t)$ is computed in the group $G$.)

8. Suppose $G_1$ and $G_2$ are connected matrix Lie groups with Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and that $\Phi : G_1 \to G_2$ is a Lie group homomorphism. Show that if the associated Lie algebra homomorphism $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is an isomorphism, then $\Phi$ is a covering map (Definition 13.1).


9. Show that Proposition 13.26 fails if we do not assume that $H'$ and $H$ are in $\mathfrak{t}_{\mathrm{reg}}$.

10. Let $E$ be a real Euclidean space. Suppose $V \subset E$ is the hyperplane through the origin orthogonal to a nonzero vector $\alpha$. Now suppose $L$ is the hyperplane (not necessarily through the origin) obtained by translating $V$ by $c\alpha$. Let $s : E \to E$ be the affine transformation given by

$$s(v) = v - 2\frac{\langle \alpha, v \rangle}{\langle \alpha, \alpha \rangle}\alpha + 2c\alpha.$$

Show that if $v \in L$ and $d \in \mathbb{R}$, then

$$s(v + d\alpha) = v - d\alpha.$$

That is to say, $s$ is the reflection about $L$.


11. Let $R^+$ be the set of positive roots associated to a particular base $\Delta$, and let $\alpha_1, \ldots, \alpha_N$ be the positive roots.

(a) Show that for each alcove $A$, there are integers $n_1, \ldots, n_N$ such that

$$A = \{H \in \mathfrak{t} \mid n_j < \langle \alpha_j, H \rangle < n_j + 1,\ j = 1, \ldots, N\}. \tag{13.15}$$

(b) If $n_1, \ldots, n_N$ is any sequence of integers, show that if the set $A$ in (13.15) is nonempty, then this set is an alcove.


12. In this exercise, we assume $\mathfrak{g} := \mathfrak{k}_{\mathbb{C}}$ is simple. Let $R^+$ be the positive roots associated to a particular base $\Delta$, and let $C$ be the fundamental Weyl chamber associated to $R^+$. Let $A$ be the alcove containing the “bottom” of $C$, that is, such that all very small elements of $C$ are in $A$. Let $\beta$ be the highest root, that is, the highest weight for the adjoint representation of $\mathfrak{g}$, which is irreducible by assumption.

(a) Show that $A$ may be described as

$$A = \{H \in \mathfrak{t} \mid 0 < \langle \alpha, H \rangle < 1,\ \forall \alpha \in R^+\}.$$

(Compare Exercise 11.)

(b) Show that $A$ may also be described as

$$A = \{H \in \mathfrak{t} \mid \langle \beta, H \rangle < 1,\ \langle \alpha, H \rangle > 0,\ \forall \alpha \in \Delta\}.$$

13. If $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ is simple, the alcove $A$ in Exercise 12 is a simplex, that is, a bounded region in $\mathfrak{t} \cong \mathbb{R}^k$ defined by $k + 1$ linear inequalities. By the same argument as in the proof of Proposition 8.24, the extended Weyl group is generated by the $k + 1$ reflections about the walls of $A$. The structure of $A$ (and thus of the extended Weyl group) can be captured in the extended Dynkin diagram, defined as follows. The diagram has $k + 1$ vertices, representing the elements $\alpha_1, \ldots, \alpha_k$ and $-\beta$. We then define edges and arrows by the same rules as for ordinary Dynkin diagrams (Sect. 8.6). (Note that $-\beta$ is at an obtuse angle with each element of $\Delta$.)

Verify the extended Dynkin diagrams for $A_2$, $B_2$, and $G_2$ in Figure 13.8, where in each diagram, the black dot indicates the “extra” vertex $-\beta$ added to the ordinary Dynkin diagram. (Refer to Figures 13.3, 13.4, and 13.5.)

Fig. 13.8 The extended Dynkin diagrams associated to $A_2$, $B_2$, and $G_2$

14. Let $K$ be a connected compact matrix Lie group with Lie algebra $\mathfrak{k}$ and let $\mathfrak{t}$ be a maximal commutative subalgebra of $\mathfrak{k}$. Let $A \subset \mathfrak{t}$ be an alcove and suppose $A$ has no nontrivial symmetries. (That is to say, suppose there is no nontrivial element of the Euclidean group of $\mathfrak{t}$ mapping $A$ onto $A$.)

(a) Show that $K$ is simply connected.
(b) Show that if $K'$ is any connected matrix Lie group whose Lie algebra is isomorphic to $\mathfrak{k}$, then $K'$ is isomorphic to $K$.


Note: It is known that there exists a connected compact group $K$ of rank 2 whose root system is $G_2$. Since each alcove for $G_2$ is a triangle with 3 distinct edge lengths, Part (a) of the problem shows that any such group $K$ must be simply connected. (Compare Example 13.45.)

15. Consider the quotient space $T/W$, that is, the set of orbits of $W$ acting on $T$. Let $A \subset \mathfrak{t}$ be any one alcove and let $\bar{A}$ be the closure of $A$ in $\mathfrak{t}$. Show that if $K$ is simply connected, then the map $H \mapsto [e^{2\pi H}]$, where $[t]$ denotes the $W$-orbit of $t$, is a bijection between $\bar{A}$ and $T/W$.

Hint: Imitate the proof of Proposition 8.29 with the Weyl group $W$ replaced by the extended Weyl group $I \rtimes W$ and the chamber $C$ replaced by the alcove $A$.


Appendix A 
Linear Algebra Review 


In this appendix, we review results from linear algebra that are used in the text. 
The results quoted here are mostly standard, and the proofs are mostly omitted. For 
more information, the reader is encouraged to consult such standard linear algebra 
textbooks as [HK] or [Axl]. Throughout this appendix, we let $M_n(\mathbb{C})$ denote the space of $n \times n$ matrices with entries in $\mathbb{C}$.


A.1 Eigenvectors and Eigenvalues


For any $A \in M_n(\mathbb{C})$, a nonzero vector $v$ in $\mathbb{C}^n$ is called an eigenvector for $A$ if there is some complex number $\lambda$ such that

$$Av = \lambda v.$$

An eigenvalue for $A$ is a complex number $\lambda$ for which there exists a nonzero $v \in \mathbb{C}^n$ with $Av = \lambda v$. Thus, $\lambda$ is an eigenvalue for $A$ if the equation $Av = \lambda v$ or, equivalently, the equation

$$(A - \lambda I)v = 0,$$

has a nonzero solution $v$. This happens precisely when $A - \lambda I$ fails to be invertible, which is precisely when $\det(A - \lambda I) = 0$. For any $A \in M_n(\mathbb{C})$, the characteristic polynomial $p$ of $A$ is given by

$$p(\lambda) = \det(A - \lambda I), \quad \lambda \in \mathbb{C}.$$

This polynomial has degree $n$. In light of the preceding discussion, the eigenvalues are precisely the zeros of the characteristic polynomial.
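
For a $2 \times 2$ matrix, the characteristic polynomial is $\lambda^2 - (\operatorname{trace} A)\lambda + \det A$, so its zeros can be written down with the quadratic formula. The following Python sketch does this; the function name is chosen here for illustration, and the approach is specific to the $2 \times 2$ case.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Zeros of det([[a, b], [c, d]] - lam*I) = lam^2 - (a + d)*lam + (a*d - b*c)."""
    trace = a + d
    determinant = a * d - b * c
    # cmath.sqrt handles a negative discriminant, giving complex eigenvalues.
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return (trace + root) / 2, (trace - root) / 2


print(eigenvalues_2x2(2, 0, 0, 3))   # diagonal matrix: eigenvalues 3 and 2
print(eigenvalues_2x2(0, -1, 1, 0))  # rotation by 90 degrees: eigenvalues i and -i
```

The second example has no real eigenvalues, which is one reason the appendix works over $\mathbb{C}$.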

We can define, more generally, the notion of eigenvector and eigenvalue for any linear operator on a vector space. If $V$ is a finite-dimensional vector space over $\mathbb{C}$ (or over any algebraically closed field), every linear operator on $V$ will have at least one eigenvalue. If $A$ is a linear operator on a vector space $V$ and $\lambda$ is an eigenvalue for $A$, the $\lambda$-eigenspace for $A$, denoted $V_\lambda$, is the set of all vectors $v \in V$ (including the zero vector) that satisfy $Av = \lambda v$. The $\lambda$-eigenspace for $A$ is a subspace of $V$. The dimension of this space is called the multiplicity of $\lambda$. (More precisely, this is the “geometric multiplicity” of $\lambda$. In the finite-dimensional case, there is also a notion of the “algebraic multiplicity” of $\lambda$, which is the number of times that $\lambda$ occurs as a root of the characteristic polynomial. The geometric multiplicity of $\lambda$ cannot exceed the algebraic multiplicity.)


Proposition A.1. Suppose that $A$ is a linear operator on a vector space $V$ and $v_1, \ldots, v_k$ are eigenvectors with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$. Then $v_1, \ldots, v_k$ are linearly independent.


Note that here V does not have to be finite dimensional. 


Proposition A.2. Suppose that $A$ and $B$ are linear operators on a finite-dimensional vector space $V$ and suppose that $AB = BA$. Then for each eigenvalue $\lambda$ of $A$, the operator $B$ maps the $\lambda$-eigenspace of $A$ into itself.

Proof. Let $\lambda$ be an eigenvalue of $A$ and let $V_\lambda$ be the $\lambda$-eigenspace of $A$. Then let $v$ be an element of $V_\lambda$ and consider $Bv$. Since $B$ commutes with $A$, we have

$$A(Bv) = BAv = \lambda Bv,$$

showing that $Bv$ is in $V_\lambda$. □


A.2 Diagonalization 


Two matrices $A, B \in M_n(\mathbb{C})$ are said to be similar if there exists an invertible matrix $C$ such that

$$A = CBC^{-1}.$$

The operation $B \mapsto CBC^{-1}$ is called conjugation of $B$ by $C$. A matrix is said to be diagonalizable if it is similar to a diagonal matrix. A matrix $A \in M_n(\mathbb{C})$ is diagonalizable if and only if there exist $n$ linearly independent eigenvectors for $A$. Specifically, if $v_1, \ldots, v_n$ are linearly independent eigenvectors, let $C$ be the matrix whose $k$th column is $v_k$. Then $C$ is invertible and we will have

$$A = C \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} C^{-1}, \tag{A.1}$$

where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues associated to the eigenvectors $v_1, \ldots, v_n$, in that order. To verify (A.1), note that $C$ maps the standard basis element $e_j$ to $v_j$. Thus, $C^{-1}$ maps $v_j$ to $e_j$, the diagonal matrix on the right-hand side of (A.1) then maps $e_j$ to $\lambda_j e_j$, and $C$ maps $\lambda_j e_j$ to $\lambda_j v_j$. Thus, both sides of (A.1) map $v_j$ to $\lambda_j v_j$, for all $j$.
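
The identity (A.1) can be checked on a small example. The Python sketch below builds $C$ from the eigenvectors of a symmetric $2 \times 2$ matrix and confirms, in exact rational arithmetic, that conjugating the diagonal matrix of eigenvalues recovers $A$. The sample matrix and the helper names are chosen here for illustration.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix, computed exactly with fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[2, 1], [1, 2]]
# Eigenvectors (1, 1) and (1, -1), with eigenvalues 3 and 1, are the columns of C.
C = [[1, 1], [1, -1]]
D = [[3, 0], [0, 1]]

reconstructed = matmul(matmul(C, D), inverse_2x2(C))
print(reconstructed == A)  # True: A = C D C^{-1}
```

The order of the eigenvalues on the diagonal of `D` must match the order of the eigenvector columns of `C`, as in the statement following (A.1).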

If A € M,,(C) has n distinct eigenvalues (i.e., n distinct roots to the characteristic 
polynomial), A is necessarily diagonalizable, by Proposition A.1. If the characteris- 
tic polynomial of A has repeated roots, A may or may not be diagonalizable. 

For $A \in M_n(\mathbb{C})$, the adjoint of $A$, denoted $A^*$, is the conjugate-transpose of $A$,

$$(A^*)_{jk} = \overline{A_{kj}}. \tag{A.2}$$

A matrix $A$ is said to be self-adjoint (or Hermitian) if $A^* = A$. A matrix $A$ is said to be skew self-adjoint (or skew Hermitian) if $A^* = -A$. A matrix is said to be unitary if $A^* = A^{-1}$. More generally, $A$ is said to be normal if $A$ commutes with $A^*$. If $A$ is normal, $A$ is necessarily diagonalizable, and, indeed, it is possible to find an orthonormal basis of eigenvectors for $A$. In such cases, the matrix $C$ in (A.1) may be taken to be unitary.

If $A$ is self-adjoint, all of its eigenvalues are real. If $A$ is real and self-adjoint (or, equivalently, real and symmetric), the eigenvectors may be taken to be real as well, which means that in this case, the matrix $C$ may be taken to be orthogonal. If $A$ is skew self-adjoint, then its eigenvalues are pure imaginary. If $A$ is unitary, then its eigenvalues are complex numbers of absolute value 1.

We summarize the results of the previous paragraphs in the following. 


Theorem A.3. Suppose that $A \in M_n(\mathbb{C})$ has the property that $A^*A = AA^*$ (e.g., if $A^* = A$, $A^* = A^{-1}$, or $A^* = -A$). Then $A$ is diagonalizable and it is possible to find an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors for $A$. If $A^* = A$, all the eigenvalues of $A$ are real; if $A^* = -A$, all the eigenvalues of $A$ are pure imaginary; and if $A^* = A^{-1}$, all the eigenvalues of $A$ have absolute value 1.


A.3 Generalized Eigenvectors and the SN Decomposition 


Not all matrices are diagonalizable, even over $\mathbb{C}$. If, for example,

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \tag{A.3}$$

then the only eigenvalue of $A$ is 1, and every eigenvector with eigenvalue 1 is of the form $(c, 0)$. Thus, we cannot find two linearly independent eigenvectors for $A$. It is not hard, however, to prove the following result.


Theorem A.4. Every matrix is similar to an upper triangular matrix. Every 
nilpotent matrix is similar to an upper triangular matrix with zeros on the diagonal. 


While Theorem A.4 is sufficient for some purposes, we will in general need something that comes a bit closer to a diagonal representation. If $A \in M_n(\mathbb{C})$ does not have $n$ linearly independent eigenvectors, we may consider the more general concept of generalized eigenvectors. A nonzero vector $v \in \mathbb{C}^n$ is called a generalized eigenvector for $A$ if there is some complex number $\lambda$ and some positive integer $k$ such that

$$(A - \lambda I)^k v = 0. \tag{A.4}$$

If (A.4) holds for some $v \neq 0$, then $(A - \lambda I)$ cannot be invertible. Thus, the number $\lambda$ must be an (ordinary) eigenvalue for $A$. However, for a fixed eigenvalue $\lambda$, there may be generalized eigenvectors $v$ that are not ordinary eigenvectors. In the case of the matrix $A$ in (A.3), for example, the vector $(0, 1)$ is a generalized eigenvector with eigenvalue 1 (with $k = 2$).

It can be shown that for every $A \in M_n(\mathbb{C})$, the space $\mathbb{C}^n$ has a basis consisting of generalized eigenvectors for $A$. For any matrix $A$ and any eigenvalue $\lambda$ for $A$, let $W_\lambda$ be the generalized eigenspace with eigenvalue $\lambda$:

$$W_\lambda = \{v \in \mathbb{C}^n \mid (A - \lambda I)^k v = 0 \text{ for some } k\}.$$

Then $\mathbb{C}^n$ decomposes as a direct sum of the $W_\lambda$'s, as $\lambda$ ranges over all the eigenvalues of $A$. Furthermore, the subspace $W_\lambda$ is easily seen to be invariant under the matrix $A$. Let $A_\lambda$ denote the restriction of $A$ to the subspace $W_\lambda$, and let $N_\lambda = A_\lambda - \lambda I$, so that

$$A_\lambda = \lambda I + N_\lambda.$$

Then $N_\lambda$ is nilpotent; that is, $N_\lambda^k = 0$ for some positive integer $k$. We summarize the preceding discussion in the following theorem.


Theorem A.5. Let $A$ be an $n \times n$ complex matrix. Then there exists a basis for $\mathbb{C}^n$ consisting of generalized eigenvectors for $A$. Furthermore, $\mathbb{C}^n$ is the direct sum of the generalized eigenspaces $W_\lambda$. Each $W_\lambda$ is invariant under $A$, and the restriction of $A$ to $W_\lambda$ is of the form $\lambda I + N_\lambda$, where $N_\lambda$ is nilpotent.


The preceding result is the basis for the following decomposition. 


Theorem A.6. Each $A \in M_n(\mathbb{C})$ has a unique decomposition as $A = S + N$ where $S$ is diagonalizable, $N$ is nilpotent, and $SN = NS$.

The expression $A = S + N$, with $S$ and $N$ as in the theorem, is called the SN decomposition of $A$. The existence of an SN decomposition follows from the previous theorem: We define $S$ to be the operator equal to $\lambda I$ on each generalized eigenspace $W_\lambda$ of $A$ and we set $N$ to be the operator equal to $N_\lambda$ on each $W_\lambda$. For example, if $A$ is the matrix in (A.3), then we have

$$S = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
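
The properties required of $S$ and $N$ in this example, and the claim that $(0, 1)$ is a generalized eigenvector of the matrix in (A.3), can be verified directly. The following Python sketch does so with small helper functions whose names are chosen here for illustration.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


A = [[1, 1], [0, 1]]
S = [[1, 0], [0, 1]]
N = [[0, 1], [0, 0]]   # N = A - I on the single generalized eigenspace
zero = [[0, 0], [0, 0]]

print(add(S, N) == A)                 # A = S + N
print(matmul(N, N) == zero)           # N is nilpotent
print(matmul(S, N) == matmul(N, S))   # S and N commute

v = [0, 1]
print(apply(N, v))                    # (A - I)v = [1, 0], so v is not an eigenvector
print(apply(matmul(N, N), v))         # (A - I)^2 v = [0, 0]
```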


A.4 The Jordan Canonical Form 


The Jordan canonical form may be viewed as a refinement of the SN decomposition, based on a further analysis of the nilpotent matrices $N_\lambda$ in Theorem A.5.

Theorem A.7. Every $A \in M_n(\mathbb{C})$ is similar to a block-diagonal matrix in which each block is of the form

$$\begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}.$$

Two matrices $A$ and $B$ are similar if and only if they have precisely the same Jordan blocks, up to reordering.

There may be several different Jordan blocks (possibly of different sizes) for the same value of $\lambda$. In the case in which $A$ is diagonalizable, each block is $1 \times 1$, in which case, the ones above the diagonal do not appear. Note that each Jordan block is, in particular, of the form $\lambda I + N$, where $N$ is nilpotent.


A.5 The Trace 


For $A \in M_n(\mathbb{C})$, we define the trace of $A$ to be the sum of the diagonal entries of $A$:

$$\operatorname{trace}(A) = \sum_{k=1}^{n} A_{kk}.$$

Note that the trace is a linear function of $A$. For $A, B \in M_n(\mathbb{C})$, we note that

$$\operatorname{trace}(AB) = \sum_{k=1}^{n} (AB)_{kk} = \sum_{k=1}^{n} \sum_{l=1}^{n} A_{kl} B_{lk}. \tag{A.5}$$

If we similarly compute $\operatorname{trace}(BA)$, we obtain the same sum with the labels for the summation variables reversed. Thus,

$$\operatorname{trace}(AB) = \operatorname{trace}(BA). \tag{A.6}$$

If $C$ is an invertible matrix and we apply (A.6) to the matrices $CA$ and $C^{-1}$, we have

$$\operatorname{trace}(CAC^{-1}) = \operatorname{trace}(C^{-1}CA) = \operatorname{trace}(A);$$

that is, similar matrices have the same trace.
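
Both identities are easy to test on integer matrices. In the Python sketch below, the sample matrices are arbitrary choices made for illustration; $C$ is picked to have determinant 1 so that its inverse also has integer entries.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[k][k] for k in range(len(m)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 1], [1, 1]]        # determinant 1
C_inv = [[1, -1], [-1, 2]]  # inverse of C

print(matmul(A, B) == matmul(B, A))                # False: A and B do not commute
print(trace(matmul(A, B)) == trace(matmul(B, A)))  # True, as in (A.6)
print(trace(matmul(matmul(C, A), C_inv)) == trace(A))  # True: similar matrices
```

The first line is a reminder that (A.6) holds even when $AB \neq BA$.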

More generally, if A is a linear operator on a finite-dimensional vector space V, 
we can define the trace of A by picking a basis and defining the trace of A to be the 
trace of the matrix that represents A in that basis. The above calculations show that 
the value of the trace of A is independent of the choice of basis. 


A.6 Inner Products 


Let $\langle \cdot, \cdot \rangle$ denote the standard inner product on $\mathbb{C}^n$, defined by

$$\langle x, y \rangle = \sum_{j=1}^{n} \overline{x_j}\, y_j,$$

where we follow the convention of putting the complex-conjugate on the first factor. We have the following basic result relating the inner product to the adjoint of a matrix, as defined in (A.2).

Proposition A.8. For all $A \in M_n(\mathbb{C})$, the adjoint $A^*$ of $A$ has the property that

$$\langle x, Ay \rangle = \langle A^*x, y \rangle \tag{A.7}$$

for all $x, y \in \mathbb{C}^n$.
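
The identity (A.7) can be tested numerically with Python's built-in complex numbers. The sketch below uses entries with small integer real and imaginary parts, so the floating-point arithmetic is exact and the two sides can be compared with `==`. The sample matrix and vectors are arbitrary choices made for illustration.

```python
def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the first factor."""
    return sum(xj.conjugate() * yj for xj, yj in zip(x, y))


def adjoint(m):
    """Conjugate transpose: (A*)_{jk} is the conjugate of A_{kj}."""
    n = len(m)
    return [[m[k][j].conjugate() for k in range(n)] for j in range(n)]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


A = [[1 + 2j, 3j], [4, 5 - 1j]]
x = [1 - 1j, 2j]
y = [3 + 0j, 1 + 1j]

left = inner(x, apply(A, y))            # <x, Ay>
right = inner(apply(adjoint(A), x), y)  # <A*x, y>
print(left == right)
```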


Proof. We compute that

$$\langle x, Ay \rangle = \sum_{j=1}^{n} \overline{x_j} \sum_{k=1}^{n} A_{jk} y_k = \sum_{j=1}^{n} \sum_{k=1}^{n} \overline{\overline{A_{jk}}\, x_j}\; y_k.$$

This last expression is just the inner product of $A^*x$ with $y$. □

We may generalize the notion of inner product as follows.

Definition A.9. If $V$ is any vector space over $\mathbb{C}$, an inner product on $V$ is a map that associates to any two vectors $u$ and $v$ in $V$ a complex number $\langle u, v \rangle$ and that has the following properties:

1. Conjugate symmetry: $\langle v, u \rangle = \overline{\langle u, v \rangle}$ for all $u, v \in V$.

2. Linearity in the second factor: $\langle u, v_1 + a v_2 \rangle = \langle u, v_1 \rangle + a \langle u, v_2 \rangle$, for all $u, v_1, v_2 \in V$ and $a \in \mathbb{C}$.

3. Positivity: For all $v \in V$, the quantity $\langle v, v \rangle$ is real and satisfies $\langle v, v \rangle \geq 0$, with $\langle v, v \rangle = 0$ only if $v = 0$.

Note that in light of the conjugate-symmetry and the linearity in the second factor, an inner product must be conjugate-linear in the first factor:

$$\langle v_1 + a v_2, u \rangle = \langle v_1, u \rangle + \bar{a} \langle v_2, u \rangle.$$

(Some authors define an inner product to be linear in the first factor and conjugate linear in the second factor.) An inner product on a real vector space is defined in the same way except that conjugate symmetry is replaced by symmetry ($\langle v, u \rangle = \langle u, v \rangle$) and the constant $a$ in Point 2 now takes only real values.

If $V$ is a vector space with inner product, the norm of a vector $v \in V$, denoted $\|v\|$, is defined by

$$\|v\| = \sqrt{\langle v, v \rangle}.$$

The positivity condition on the inner product guarantees that $\|v\|$ is always a non-negative real number and that $\|v\| = 0$ only if $v = 0$. If, for example, $V = M_n(\mathbb{C})$, we may define the Hilbert-Schmidt inner product by the formula

$$\langle A, B \rangle = \operatorname{trace}(A^* B). \tag{A.8}$$
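
A small numerical experiment with (A.8) may help fix the conventions. The Python sketch below computes $\operatorname{trace}(A^*B)$ for two sample matrices, chosen here for illustration, and checks conjugate symmetry together with the fact that $\langle A, A \rangle$ is the sum of the squared absolute values of the entries of $A$. The entries have integer real and imaginary parts, so the comparisons are exact.

```python
def adjoint(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[k][j].conjugate() for k in range(n)] for j in range(n)]


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def hilbert_schmidt(a, b):
    """<A, B> = trace(A* B)."""
    product = matmul(adjoint(a), b)
    return sum(product[k][k] for k in range(len(product)))


A = [[1 + 1j, 2], [0, 3j]]
B = [[1j, 1], [1, 0]]

# Conjugate symmetry: <A, B> is the complex conjugate of <B, A>.
print(hilbert_schmidt(A, B) == hilbert_schmidt(B, A).conjugate())
# <A, A> equals the sum of |A_kl|^2, computed here as real^2 + imag^2.
squares = sum(e.real ** 2 + e.imag ** 2 for row in A for e in row)
print(hilbert_schmidt(A, A) == squares)
```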


It is easy to check that this expression is conjugate symmetric and linear in the second factor. Furthermore, we may compute as in (A.5) that

$$\operatorname{trace}(A^* A) = \sum_{k,l=1}^{n} \overline{A_{kl}}\, A_{kl} = \sum_{k,l=1}^{n} |A_{kl}|^2 \geq 0,$$

and the sum is zero only if each entry of $A$ is zero. The associated Hilbert-Schmidt norm satisfies

$$\|A\|^2 = \sum_{k,l=1}^{n} |A_{kl}|^2.$$

Suppose that $V$ is a finite-dimensional vector space with inner product and that $W$ is a subspace of $V$. Then the orthogonal complement of $W$, denoted $W^\perp$, is the set of all vectors $v$ in $V$ such that $\langle w, v \rangle = 0$ for all $w$ in $W$. The space $V$ then decomposes as the direct sum of $W$ and $W^\perp$.

We now introduce the abstract notion of the adjoint of a matrix. 


Proposition A.10. Let $V$ be a finite-dimensional vector space with an inner product $\langle \cdot, \cdot \rangle$. If $A$ is a linear map from $V$ to $V$, there is a unique operator $A^* : V \to V$ such that

$$\langle u, Av \rangle = \langle A^*u, v \rangle$$

for all $u, v \in V$. Furthermore, if $W$ is a subspace of $V$ that is invariant under $A$, then $W^\perp$ is invariant under $A^*$.


A.7 Dual Spaces 


If $V$ is a vector space over $\mathbb{C}$, a linear functional on $V$ is a linear map of $V$ into
$\mathbb{C}$. If $v_1, \ldots, v_n$ is a basis for $V$, then for each set of constants $a_1, \ldots, a_n$, there
is a unique linear functional $\phi$ such that $\phi(v_k) = a_k$. If $V$ is a finite-dimensional
complex vector space, then the dual space to $V$, denoted $V^{*}$, is the set of all linear
functionals on $V$. The dual space is also a vector space and its dimension is the same
as that of $V$.

If $W$ is a subspace of a vector space $V$, the annihilator subspace of $W$, denoted
$W^{\wedge}$, is the set of all $\phi$ in $V^{*}$ such that $\phi(w) = 0$ for all $w$ in $W$. Then $W^{\wedge}$ is a
subspace of $V^{*}$. If $V$ is finite dimensional, then


\[ \dim W + \dim W^{\wedge} = \dim V \]


and the map $W \mapsto W^{\wedge}$ provides a one-to-one correspondence between subspaces
of $V$ and subspaces of $V^{*}$.

In general, one should be careful to distinguish between a vector space and its 
dual. Nevertheless, when V is finite dimensional and has an inner product, we can 
produce an identification between V and V*. 


Proposition A.11. Let $V$ be a finite-dimensional inner product space and let $\phi$ be
a linear functional on $V$. Then there exists a unique $w \in V$ such that


\[ \phi(v) = \langle w, v \rangle \]


for all $v \in V$.


Recall that we follow the convention that inner products are linear in the second 
factor, so that (w, v) is, indeed, linear in v. 


A.8 Simultaneous Diagonalization


We now extend the notion of eigenvectors and diagonalization to families of linear 
operators. 


Definition A.12. Let $V$ be a vector space and let $\mathcal{A}$ be a collection of linear
operators on $V$. A nonzero vector $v \in V$ is a simultaneous eigenvector for $\mathcal{A}$ if
for all $A \in \mathcal{A}$, there exists a constant $\lambda_A$ such that $Av = \lambda_A v$. The numbers $\lambda_A$ are
the simultaneous eigenvalues associated to $v$.


Consider, for example, the space $\mathcal{D}$ of all diagonal $n \times n$ matrices. Then for each
$k = 1, \ldots, n$, the standard basis element $e_k$ is a simultaneous eigenvector for $\mathcal{D}$.
For each diagonal matrix $A$, the simultaneous eigenvalue associated to $e_k$ is the $k$th
diagonal entry of $A$.


Proposition A.13. If $\mathcal{A}$ is a commuting family of linear operators on a
finite-dimensional complex vector space, then $\mathcal{A}$ has at least one simultaneous
eigenvector.


It is essential here that the elements of $\mathcal{A}$ commute; noncommuting families of
operators typically have no simultaneous eigenvectors.

In many cases, the collection $\mathcal{A}$ of operators on $V$ is a subspace of $\operatorname{End}(V)$,
the space of all linear operators from $V$ to itself. In that case, if $v$ is a simultaneous
eigenvector for $\mathcal{A}$, the eigenvalues $\lambda_A$ for $v$ depend linearly on $A$. After all, if
$A_1 v = \lambda_1 v$ and $A_2 v = \lambda_2 v$, then


\[ (A_1 + cA_2)v = (\lambda_1 + c\lambda_2)v. \]


The preceding discussion leads to the following definition. 


Definition A.14. Suppose that $V$ is a vector space and $\mathcal{A}$ is a vector space of linear
operators on $V$. A weight for $\mathcal{A}$ is a linear functional $\mu$ on $\mathcal{A}$ such that there exists
a nonzero vector $v \in V$ satisfying


\[ Av = \mu(A)v \]


for all $A$ in $\mathcal{A}$. For a fixed weight $\mu$, the set of all vectors $v \in V$ satisfying
$Av = \mu(A)v$ for all $A$ in $\mathcal{A}$ is called the weight space associated to the weight $\mu$.


That is to say, a weight is a set of simultaneous eigenvalues for the operators in
$\mathcal{A}$. If $V$ is finite dimensional and the elements of $\mathcal{A}$ all commute with one another,
then there will exist at least one weight for $\mathcal{A}$.

If $\mathcal{A}$ is finite dimensional and comes equipped with an inner product, it is
convenient to express the linear functional $\mu$ in Definition A.14 as the inner product
of $A$ with some vector, as in Proposition A.11. From this point of view, we define
a weight to be an element $\mu$ of $\mathcal{A}$ (not $\mathcal{A}^{*}$) such that there exists a nonzero $v$ in $V$
with


\[ Av = \langle \mu, A \rangle v \]


for all $A \in \mathcal{A}$.


Definition A.15. Suppose that $V$ is a finite-dimensional vector space and $\mathcal{A}$ is
some collection of linear operators on $V$. Then the elements of $\mathcal{A}$ are said to be
simultaneously diagonalizable if there exists a basis $v_1, \ldots, v_n$ for $V$ such that
each $v_k$ is a simultaneous eigenvector for $\mathcal{A}$.


If $\mathcal{A}$ is a vector space of linear operators on $V$, then saying that the elements of $\mathcal{A}$
are simultaneously diagonalizable is equivalent to saying that $V$ can be decomposed
as a direct sum of weight spaces of $\mathcal{A}$.

If a collection $\mathcal{A}$ of operators is simultaneously diagonalizable, then the elements
of $\mathcal{A}$ must commute, since they commute when applied to each $v_k$. Conversely, if
each $A \in \mathcal{A}$ is diagonalizable by itself and if the elements of $\mathcal{A}$ commute, then
(it can be shown) the elements of $\mathcal{A}$ are simultaneously diagonalizable. We record
these results in the following proposition.


Proposition A.16. If $\mathcal{A}$ is a commuting collection of linear operators on a
finite-dimensional vector space $V$ and each $A \in \mathcal{A}$ is diagonalizable, then the elements
of $\mathcal{A}$ are simultaneously diagonalizable.
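
A small concrete instance may help fix the ideas of weights and simultaneous diagonalization. The sketch below takes a hypothetical commuting pair of symmetric $2 \times 2$ matrices, checks that two candidate vectors are simultaneous eigenvectors, and verifies that the resulting eigenvalues depend linearly on the operator, as a weight must. The matrices, the vectors, and all function names are illustrative assumptions.

```python
from fractions import Fraction

def apply(A, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def eigenvalue(A, v):
    """Return lam with A v = lam v; raise ValueError if v is not an eigenvector."""
    Av = apply(A, v)
    i = next(k for k, x in enumerate(v) if x != 0)
    lam = Fraction(Av[i], v[i])
    if any(Av[k] != lam * v[k] for k in range(len(v))):
        raise ValueError("not an eigenvector")
    return lam

# Hypothetical commuting pair; each matrix is symmetric, hence diagonalizable.
A1 = [[2, 1], [1, 2]]
A2 = [[0, 1], [1, 0]]
commute = matmul(A1, A2) == matmul(A2, A1)

# Candidate simultaneous eigenvectors; together they form a basis of R^2.
v1, v2 = [1, 1], [1, -1]

# The weight attached to v_i, recorded by its values on A1 and A2.
mu1 = (eigenvalue(A1, v1), eigenvalue(A2, v1))
mu2 = (eigenvalue(A1, v2), eigenvalue(A2, v2))

# Linearity on the span of A1 and A2: mu(A1 + c A2) = mu(A1) + c mu(A2).
c = 5
A_sum = [[a + c * b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
linear = (eigenvalue(A_sum, v1) == mu1[0] + c * mu1[1]
          and eigenvalue(A_sum, v2) == mu2[0] + c * mu2[1])
print(commute, mu1, mu2, linear)
```

Here the two weights are distinct, so the two weight spaces are the lines through v1 and v2, and $\mathbb{R}^2$ is their direct sum.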


We close this appendix with an analog of Proposition A.1 for simultaneous 
eigenvectors. 


Proposition A.17. Suppose $V$ is a vector space and $\mathcal{A}$ is a vector space of linear
operators on $V$. Suppose $\mu_1, \ldots, \mu_m$ are distinct weights for $\mathcal{A}$ and $v_1, \ldots, v_m$ are
elements of the corresponding weight spaces. If $v_1 + \cdots + v_m = 0$, then $v_j = 0$ for
all $j = 1, \ldots, m$. Furthermore, if $v_1 + \cdots + v_m$ is a weight vector with weight $\mu$,
then $\mu = \mu_j$ for some $j$ and $v_k = 0$ for all $k \neq j$.


Since this result is not quite standard, we provide a proof. 


Proof. Assume first that $v_1 + \cdots + v_m = 0$, with $v_j$ in the weight space with weight
$\mu_j$. If $m = 1$, then we have $v_1 = 0$, as claimed. If $m > 1$, choose some $A \in \mathcal{A}$ such
that $\mu_1(A) \neq \mu_2(A)$. If we then apply the operator $A - \mu_2(A)I$ to $v_1 + \cdots + v_m$,
we obtain


\[ 0 = \sum_{j=1}^{m} (\mu_j(A) - \mu_2(A))\, v_j. \tag{A.9} \]


Now, the $j = 2$ term in (A.9) is zero, so that the sum actually contains at most $m - 1$
nonzero terms. Thus, by induction on $m$, we can assume that each term in (A.9) is
zero. In particular, $(\mu_1(A) - \mu_2(A))v_1 = 0$, which implies (by our choice of $A$) that
$v_1 = 0$. Once $v_1$ is known to be zero, the original sum $v_1 + \cdots + v_m$ contains at
most $m - 1$ nonzero terms. Thus, using induction on $m$ again, we see that each term
in the sum is zero.

Assume now that $v := v_1 + \cdots + v_m$ is a (nonzero) weight vector with some
weight $\mu$, and choose some $j$ for which $v_j \neq 0$. Then for each $A \in \mathcal{A}$, we have


\[ 0 = Av - \mu(A)v = \sum_{k=1}^{m} (\mu_k(A) - \mu(A))\, v_k. \]


Thus, by the first part of the proposition, we must have $(\mu_k(A) - \mu(A))v_k = 0$ for
all $k$. Taking $k = j$, we conclude that $\mu_j(A) - \mu(A) = 0$. Since this result holds
for all $A \in \mathcal{A}$, we see that $\mu = \mu_j$. Finally, for any $k \neq j$, we can choose $A \in \mathcal{A}$
so that $\mu_k(A) \neq \mu_j(A)$. With this value of $A$ (and with $\mu = \mu_j$), the fact that
$(\mu_k(A) - \mu_j(A))v_k = 0$ forces $v_k$ to be zero. $\square$


Appendix B 
Differential Forms 


In this section, we give a very brief outline of the theory of differential forms on
manifolds. Since this is all we require, we consider only top-degree forms, that is,
$k$-forms on $k$-dimensional manifolds. See Chapter 16 in [Lee] for more information.
We begin by considering forms at a single point, which is just a topic in linear
algebra.


Definition B.1. If $V$ is a $k$-dimensional real vector space, a map $\alpha : V^k \to \mathbb{R}$ is
said to be $k$-linear and alternating if (1) $\alpha(v_1, \ldots, v_k)$ is linear with respect to each $v_j$
with the other variables fixed, and (2) $\alpha$ changes sign whenever any two of the
variables are interchanged:


\[ \alpha(v_1, \ldots, v_l, \ldots, v_m, \ldots, v_k) = -\alpha(v_1, \ldots, v_m, \ldots, v_l, \ldots, v_k). \]


It is a standard result in linear algebra (e.g., Theorem 2 in Section 5.3 of [HK])
that every $k$-dimensional real vector space admits a nonzero $k$-linear, alternating
form, and that any two such forms differ by multiplication by a constant. If
$T : V \to V$ is a linear map and $\alpha$ is a $k$-linear, alternating form on $V$, then for any
$v_1, \ldots, v_k \in V$, we have


\[ \alpha(Tv_1, \ldots, Tv_k) = (\det T)\,\alpha(v_1, \ldots, v_k). \tag{B.1} \]


If $v_1, \ldots, v_k$ and $w_1, \ldots, w_k$ are two ordered bases for $V$, then there is a unique
invertible linear transformation $T : V \to V$ such that $Tv_j = w_j$. We may divide the
collection of all ordered bases of V into two groups, where two ordered bases belong 
to the same group if the linear map relating them has positive determinant and the 
two bases belong to different groups if the linear map relating them has negative 
determinant. An orientation of V is then a choice of one of the two groups of bases. 
Once an orientation of V has been chosen, we say that a basis is positively oriented 
if it belongs to the chosen group of bases. If $\alpha$ is a nonzero $k$-linear, alternating form
on $V$, we can define an orientation of $V$ by decreeing an ordered basis $v_1, \ldots, v_k$ to
be positively oriented if $\alpha(v_1, \ldots, v_k) > 0$.

The following example of a $k$-linear, alternating form on $\mathbb{R}^k$ will help motivate
the notion of a $k$-form. For any vectors $v_1, \ldots, v_k$ in $\mathbb{R}^k$, define the parallelepiped
$P_{v_1, \ldots, v_k}$ spanned by these vectors as follows:


\[ P_{v_1, \ldots, v_k} = \{ c_1 v_1 + \cdots + c_k v_k \mid 0 \leq c_l \leq 1 \}. \tag{B.2} \]


(If $k = 2$, then a parallelepiped is just a parallelogram.) Let us use the orientation
on $\mathbb{R}^k$ in which the standard basis $e_1, \ldots, e_k$ is positively oriented.


Example B.2. Define a map $V : (\mathbb{R}^k)^k \to \mathbb{R}$ by


\[ V(v_1, \ldots, v_k) = \pm \operatorname{Vol}(P_{v_1, \ldots, v_k}), \tag{B.3} \]


where we take a plus sign if $v_1, \ldots, v_k$ is a positively oriented basis for $\mathbb{R}^k$ and a
minus sign if it is a negatively oriented basis. Then $V$ is a $k$-linear, alternating form
on $\mathbb{R}^k$.


Note that the volume of $P_{v_1, \ldots, v_k}$ is zero if $v_1, \ldots, v_k$ do not form a basis for
$\mathbb{R}^k$, in which case, we do not have to worry about the sign on the right-hand side
of (B.3). Now, it is known that the volume of $P_{v_1, \ldots, v_k}$ is equal to $|\det T|$, where $T$
is the $k \times k$ matrix whose columns are the vectors $v_1, \ldots, v_k$. This claim is a very
special case of the change-of-variables theorem in multivariate calculus and can be
proved by expressing $T$ as a product of elementary matrices. We can then see that
$V(v_1, \ldots, v_k)$ is equal to $\det T$ (without the absolute value signs). Meanwhile, it is a
standard result from linear algebra that the determinant of $T$ is a $k$-linear, alternating
function of its column vectors $v_1, \ldots, v_k$.
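
The properties claimed for the signed volume can be spot-checked with exact integer arithmetic. The sketch below, for $k = 3$, computes the determinant of the matrix with columns $v_1, v_2, v_3$ and tests the alternating property, linearity in the first slot, and the transformation rule (B.1). The vectors, the matrix T, and the function names are illustrative assumptions.

```python
from itertools import permutations

def det(M):
    """Determinant via the Leibniz permutation expansion (fine for small k)."""
    k = len(M)
    total = 0
    for p in permutations(range(k)):
        inversions = sum(1 for i in range(k) for j in range(i + 1, k) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(k):
            term *= M[i][p[i]]
        total += term
    return total

def signed_volume(*vectors):
    """V(v_1, ..., v_k): determinant of the matrix whose columns are the v_j."""
    k = len(vectors)
    return det([[vectors[j][i] for j in range(k)] for i in range(k)])

def apply(T, v):
    return [sum(a * x for a, x in zip(row, v)) for row in T]

# Arbitrary integer test data (illustrative assumption).
v1, v2, v3 = [1, 0, 2], [0, 3, 1], [1, 1, 0]
w = [2, -1, 5]
T = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]

base = signed_volume(v1, v2, v3)
swapped = signed_volume(v2, v1, v3)                              # two slots interchanged
lin_lhs = signed_volume([a + 4 * b for a, b in zip(v1, w)], v2, v3)
lin_rhs = base + 4 * signed_volume(w, v2, v3)                    # linearity in slot 1
transformed = signed_volume(apply(T, v1), apply(T, v2), apply(T, v3))
print(base, swapped, lin_lhs == lin_rhs, transformed == det(T) * base)
```

With this data the signed volume is negative, so $v_1, v_2, v_3$ is a negatively oriented basis of $\mathbb{R}^3$ for the standard orientation.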

We now turn to a discussion of top-degree forms on manifolds. If $M$ is a
$k$-dimensional manifold (say, embedded into some $\mathbb{R}^N$), we have the notion of the
tangent space to $M$ at $m$, denoted $T_m M$, which is a $k$-dimensional subspace of $\mathbb{R}^N$.


Definition B.3. Suppose $M$ is a smoothly embedded, $k$-dimensional submanifold
of $\mathbb{R}^N$ for some $k, N$. A $k$-form $\alpha$ on $M$ is a smoothly varying family $\alpha_m$ of $k$-linear,
alternating maps on $T_m M$, one for each $m \in M$.


To be precise, let us say that a family $\alpha_m$ of $k$-linear, alternating forms on each
$T_m M$ is “smoothly varying” if the following condition holds. Suppose $X_1, \ldots, X_k$
are smooth $\mathbb{R}^N$-valued functions on $\mathbb{R}^N$ with the property that for each $m \in M$, the
vector $X_j(m)$ is tangent to $M$. Then the function $\alpha_m(X_1(m), \ldots, X_k(m))$, $m \in M$,
should be a smooth function on $M$.

The “purpose in life” of a $k$-form $\alpha$ on a $k$-dimensional manifold $M$ is to
be integrated over regions in $M$. More precisely, we must assume that $M$ is
orientable (meaning that it is possible to choose an orientation of each tangent
space $T_m M$ in a way that varies continuously with $m$) and that we have chosen
an orientation on $M$. Then if $E$ is a “nice” subset of $M$ (to be precise, a compact
$k$-dimensional submanifold with boundary), there is a notion of the integral of $\alpha$
over $E \subset M$, denoted


\[ \int_E \alpha. \]


Fig. B.1 The integral of $\alpha$ over the small region $F \subset M$ is approximately equal to $\alpha(v_1, v_2)$


The value of $\int_E \alpha$ may be thought of as assigning a sort of (possibly negative)
“volume” to the set $E$. If $\alpha$ is a $k$-form on $M$ and $f : M \to \mathbb{R}$ is a smooth
function, then $f\alpha$ is also a $k$-form on $M$, which may also be integrated, using the
same orientation we used to integrate $\alpha$.

We may gain an intuitive understanding of the notion of integration of $k$-forms
as follows. For any region $E \subset M$, we may think of chopping $E$ up into very small
subregions $F$, each of which is shaped like a parallelepiped (as in (B.2)). More
specifically, each subregion will look like the parallelepiped spanned by tangent
vectors $v_1, \ldots, v_k$ at some point $m \in E$, which we can arrange to be positively
oriented. The idea is then that the integral of $\alpha$ over each subregion should be
approximately $\alpha_m(v_1, \ldots, v_k)$. (See Figure B.1.) The integral of $\alpha$ over all of $E$
should then be the sum of its integrals over the subregions.

If we think of $\int_E \alpha$ as a sort of volume of the set $E$, then $\alpha_m(v_1, \ldots, v_k)$
represents the volume (possibly with a minus sign) of a small parallelepiped-shaped
subregion inside $E$. Example B.2 then makes it natural that we should require
$\alpha_m(v_1, \ldots, v_k)$ to be $k$-linear and alternating.

We may give a more precise definition of the integral of a differential form as
follows. We choose a local coordinate system $x_1, \ldots, x_k$ on our oriented manifold
$M$, defined in some open set $U$. We then let $\partial/\partial x_1, \ldots, \partial/\partial x_k$ denote the associated
basis for the tangent space at each point. (In coordinates, $\partial/\partial x_j$ is the unit vector
in the $x_j$-direction.) We assume the coordinate system is “oriented,” meaning that
$\partial/\partial x_1, \ldots, \partial/\partial x_k$ is an oriented basis for the tangent space at each point in $U$.


Definition B.4. Let $\alpha$ be a $k$-form on an oriented $k$-dimensional manifold $M$ and
suppose $E \subset M$ is a compact subset of the domain $U$ of $\{x_j\}$. We then define
$\int_E \alpha$ as


\[ \int_E \alpha = \int_E \alpha\!\left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_k} \right) dx_1\, dx_2 \cdots dx_k, \tag{B.4} \]


where the integral on the right-hand side of (B.4) is an ordinary integral in Euclidean
space.


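The integral on the right-hand side of (B.4) may be defined as a Riemann integral
or using Lebesgue measure on $\mathbb{R}^k$. A key point in the definition is to verify that the
value of $\int_E \alpha$ is independent of the choice of coordinates. To this end, suppose $\{y_j\}$
is another oriented coordinate system whose domain includes $E$. Then by the chain
rule, we have


\[ \frac{\partial f}{\partial x_l} = \sum_{m=1}^{k} \frac{\partial f}{\partial y_m} \frac{\partial y_m}{\partial x_l} \]


for any smooth function $f$. That is to say,


\[ \frac{\partial}{\partial x_l} = \sum_{m=1}^{k} \frac{\partial y_m}{\partial x_l} \frac{\partial}{\partial y_m}. \]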

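Thus, if $T$ is the matrix whose entries are $T_{lm} = \partial y_m / \partial x_l$, we will have, by (B.1),


\[ \alpha\!\left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_k} \right) = (\det T)\, \alpha\!\left( \frac{\partial}{\partial y_1}, \ldots, \frac{\partial}{\partial y_k} \right). \]


On the other hand, the classical change of variables theorem says that


\[ \int_E f(x_1, \ldots, x_k)\, dx_1\, dx_2 \cdots dx_k = \int_E f(y_1, \ldots, y_k)\, J\, dy_1\, dy_2 \cdots dy_k, \]


where $J$ is the determinant of the matrix $\{\partial x_m / \partial y_l\}$. (For example, in the $k = 1$
case, $J$ is just $dx/dy$, which is obtained by writing $dx = (dx/dy)\, dy$.) But by the
chain rule again, the matrix $\{\partial x_m / \partial y_l\}$ is the inverse of the matrix $\{\partial y_m / \partial x_l\}$. Thus,
$J$ is the reciprocal of $\det T$, and we see that


\[ \int_E \alpha\!\left( \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_k} \right) dx_1\, dx_2 \cdots dx_k = \int_E \alpha\!\left( \frac{\partial}{\partial y_1}, \ldots, \frac{\partial}{\partial y_k} \right) dy_1\, dy_2 \cdots dy_k, \]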

as claimed. 
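
The independence of coordinates can be seen in a one-dimensional worked example. The setting below is a hypothetical illustration: we take $M = (0, \infty)$ with coordinate $x$, the region $E = [1, 2]$, the 1-form determined by $\alpha(\partial/\partial x) = x^2$, and the second oriented coordinate $y = x^3$ (oriented because $dy/dx = 3x^2 > 0$, so that $\det T = 3x^2$).

```latex
% Hypothetical example: M = (0, \infty), E = [1, 2], \alpha(\partial/\partial x) = x^2, y = x^3.
\[
\frac{\partial}{\partial x} = \frac{dy}{dx}\,\frac{\partial}{\partial y} = 3x^{2}\,\frac{\partial}{\partial y}
\quad\Longrightarrow\quad
\alpha\!\left(\frac{\partial}{\partial y}\right)
  = \frac{1}{3x^{2}}\,\alpha\!\left(\frac{\partial}{\partial x}\right) = \frac{1}{3},
\]
\[
\int_{1}^{2} \alpha\!\left(\frac{\partial}{\partial x}\right) dx = \int_{1}^{2} x^{2}\,dx = \frac{7}{3},
\qquad
\int_{1}^{8} \alpha\!\left(\frac{\partial}{\partial y}\right) dy = \int_{1}^{8} \frac{1}{3}\,dy = \frac{7}{3}.
\]
```

The region $E$ corresponds to $1 \leq y \leq 8$ in the second coordinate, and the two computations of the integral of $\alpha$ over $E$ agree.
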

Note that if we think of the integral in (B.4) as a Riemann integral, we
compute the integral by covering $E$ with small $k$-dimensional “rectangles,” and
these rectangles may be thought of as being “spanned” by multiples of the vectors
$\partial/\partial x_1, \ldots, \partial/\partial x_k$. In the Riemann integral, the integral of $\alpha$ over each small
rectangle is being approximated by $\alpha(\partial/\partial x_1, \ldots, \partial/\partial x_k)$ times the volume of the
rectangle, in agreement with the preceding intuitive description of the integral.

If we wish to integrate a $k$-form $\alpha$ over a general $k$-dimensional, compact subset
$E$ of $M$, we use a partition of unity to write $\alpha$ as a sum of forms $\alpha_j$, each of which
is supported in a small region in $M$. For each $j$, we choose a coordinate system
defined on a set $U_j$ containing the support of $\alpha_j$. We then integrate $\alpha_j$ over $E \cap U_j$
and sum over $j$.


Appendix C 
Clebsch-Gordan Theory 
and the Wigner—Eckart Theorem 


C.1 Tensor Products of sl(2; C) Representations 


The irreducible representations of $\mathrm{SU}(2)$ (or, equivalently, of $\mathrm{sl}(2;\mathbb{C})$) were
classified in Sect. 4.6 and may be realized in spaces of homogeneous polynomials in two
complex variables as in Example 4.10. For each non-negative integer $m$, we have
an irreducible representation $(\pi_m, V_m)$ of $\mathrm{sl}(2;\mathbb{C})$ of dimension $m + 1$, and every
irreducible representation of $\mathrm{sl}(2;\mathbb{C})$ is isomorphic to one of these. We are using
here the mathematicians’ labeling of the representations; in the physics literature,
the representations are labeled by the “spin” $l := m/2$.

By the averaging method of Sect. 4.4, we can find on each space $V_m$ an inner
product that is invariant under the action of the compact group $\mathrm{SU}(2)$. (In the case
of $V_1 = \mathbb{C}^2$, we can use the standard inner product on $\mathbb{C}^2$, and for any $m$, it is not hard
to describe such an inner product explicitly.) With respect to such an inner product,
the orthogonal complement of a subspace invariant under $\mathrm{SU}(2)$ (or, equivalently,
under $\mathrm{sl}(2;\mathbb{C})$) is again invariant under $\mathrm{SU}(2)$. Since the element $H = \operatorname{diag}(1, -1)$
of $\mathrm{sl}(2;\mathbb{C})$ is in $i\,\mathrm{su}(2)$, $\pi_m(H)$ will be self-adjoint with respect to this inner product.
Thus, eigenvectors of $\pi_m(H)$ with distinct eigenvalues must be orthogonal.

Recall from Sect. 4.3.2 the notion of the tensor product of representations of a
group or Lie algebra. We consider this in the case of the irreducible representations
of $\mathrm{sl}(2;\mathbb{C})$. We regard the tensor product $V_m \otimes V_n$ as a representation of $\mathrm{sl}(2;\mathbb{C})$.
(Recall that it is also possible to view $V_m \otimes V_n$ as a representation of
$\mathrm{sl}(2;\mathbb{C}) \oplus \mathrm{sl}(2;\mathbb{C})$.) The action of $\mathrm{sl}(2;\mathbb{C})$ on $V_m \otimes V_n$ is given by


\[ (\pi_m \otimes \pi_n)(X) = \pi_m(X) \otimes I + I \otimes \pi_n(X). \tag{C.1} \]


We compute in the standard basis $\{X, Y, H\}$ for $\mathrm{sl}(2;\mathbb{C})$. Once we have chosen
$\mathrm{SU}(2)$-invariant inner products on $V_m$ and $V_n$, there is a unique inner product
on $V_m \otimes V_n$ with the property that $\langle u_1 \otimes v_1, u_2 \otimes v_2 \rangle = \langle u_1, u_2 \rangle \langle v_1, v_2 \rangle$. (This
assertion can be proved using the universal property of tensor products.) The inner
product on $V_m \otimes V_n$ is also invariant under the action of $\mathrm{SU}(2)$. We assume in the
rest of this section that an inner product of this sort has been chosen on each $V_m \otimes V_n$.

In general, $V_m \otimes V_n$ will not be an irreducible representation of $\mathrm{sl}(2;\mathbb{C})$; the
goal of this section is to describe how $V_m \otimes V_n$ decomposes as a direct sum of
irreducible invariant subspaces. This decomposition is referred to as the
Clebsch-Gordan theory. Let us consider first the case of $V_1 \otimes V_1$, where $V_1 = \mathbb{C}^2$, the
standard representation of $\mathrm{sl}(2;\mathbb{C})$. If $\{e_1, e_2\}$ is the standard basis for $\mathbb{C}^2$, then
the vectors of the form $e_k \otimes e_l$, $1 \leq k, l \leq 2$, form a basis for $\mathbb{C}^2 \otimes \mathbb{C}^2$. Since
$e_1$ and $e_2$ are eigenvectors for $\pi_1(H)$ with eigenvalues $1$ and $-1$, respectively, then,
by (C.1), the basis elements for $\mathbb{C}^2 \otimes \mathbb{C}^2$ are eigenvectors for the action of $H$ with
eigenvalues $2$, $0$, $0$, and $-2$, respectively. Since $2$ is the largest eigenvalue for $H$, the
corresponding eigenvector $e_1 \otimes e_1$ must be annihilated by $X$ (i.e., by the operator
$\pi_1(X) \otimes I + I \otimes \pi_1(X)$).

If we then apply $Y$ (i.e., the operator $\pi_1(Y) \otimes I + I \otimes \pi_1(Y)$) repeatedly to
$e_1 \otimes e_1$, we obtain $e_1 \otimes e_2 + e_2 \otimes e_1$, then $2 e_2 \otimes e_2$, and then the zero vector. The
space spanned by these vectors is invariant under $\mathrm{sl}(2;\mathbb{C})$ and irreducible, and is
isomorphic to the three-dimensional representation $V_2$. The orthogonal complement
of this space in $\mathbb{C}^2 \otimes \mathbb{C}^2$, namely the span of $e_1 \otimes e_2 - e_2 \otimes e_1$, is also invariant, and
$\mathrm{sl}(2;\mathbb{C})$ acts trivially on this space. Thus,


\[ \mathbb{C}^2 \otimes \mathbb{C}^2 = \operatorname{span}\{ e_1 \otimes e_1,\ e_1 \otimes e_2 + e_2 \otimes e_1,\ e_2 \otimes e_2 \} \oplus \operatorname{span}\{ e_1 \otimes e_2 - e_2 \otimes e_1 \}. \]


We see, then, that the four-dimensional space $V_1 \otimes V_1$ is isomorphic, as an $\mathrm{sl}(2;\mathbb{C})$
representation, to $V_2 \oplus V_0$.
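
The computation for $V_1 \otimes V_1$ can be reproduced with Kronecker products. The sketch below assumes the usual matrices for $H$, $X$, $Y$ in the standard representation and orders the basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ as $e_1 \otimes e_1$, $e_1 \otimes e_2$, $e_2 \otimes e_1$, $e_2 \otimes e_2$; the function names are ours.

```python
def kron(A, B):
    """Kronecker product of square nested-list matrices."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

I2 = [[1, 0], [0, 1]]
H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

def tensor_action(Z):
    """Z acting on C^2 (x) C^2 as Z (x) I + I (x) Z, following (C.1)."""
    return add(kron(Z, I2), kron(I2, Z))

# H acts diagonally in the chosen basis; read off its eigenvalues.
h_eigenvalues = [tensor_action(H)[i][i] for i in range(4)]

# Apply Y repeatedly to e1 (x) e1 until the zero vector appears.
chain = [[1, 0, 0, 0]]
while any(chain[-1]):
    chain.append(apply(tensor_action(Y), chain[-1]))

# e1 (x) e2 - e2 (x) e1 should be annihilated by X, Y, and H.
singlet = [0, 1, -1, 0]
trivial = all(apply(tensor_action(Z), singlet) == [0, 0, 0, 0] for Z in (X, Y, H))
print(h_eigenvalues, chain, trivial)
```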


Theorem C.1. Let $m$ and $n$ be non-negative integers with $m \geq n$. If we consider
$V_m \otimes V_n$ as a representation of $\mathrm{sl}(2;\mathbb{C})$, then


\[ V_m \otimes V_n \cong V_{m+n} \oplus V_{m+n-2} \oplus \cdots \oplus V_{m-n+2} \oplus V_{m-n}, \]


where $\cong$ denotes an isomorphism of $\mathrm{sl}(2;\mathbb{C})$ representations.


Note that this theorem is consistent with the special case worked out earlier:
$V_1 \otimes V_1 \cong V_2 \oplus V_0$. For applications to the Wigner–Eckart theorem, a key property
of the decomposition in Theorem C.1 is that it is multiplicity free. That is to say,
each irreducible representation that occurs in the decomposition of $V_m \otimes V_n$ occurs
only once. This is a special feature of the representations of $\mathrm{sl}(2;\mathbb{C})$; the analogous
statement does not hold for tensor products of representations of other Lie algebras.


Proof. Let us take a basis for each of the two spaces that is labeled by the
eigenvalues for $H$. That is to say, we choose a basis $u_m, u_{m-2}, \ldots, u_{-m}$ for $V_m$ and
$v_n, v_{n-2}, \ldots, v_{-n}$ for $V_n$, with $\pi_m(H)u_j = j u_j$ and $\pi_n(H)v_k = k v_k$. Then the
vectors of the form $u_j \otimes v_k$ form a basis for $V_m \otimes V_n$, and we compute that


\[ [\pi_m(H) \otimes I + I \otimes \pi_n(H)]\, u_j \otimes v_k = (j + k)\, u_j \otimes v_k. \]

Thus, each of our basis elements is an eigenvector for the action of $H$ on $V_m \otimes V_n$.
The eigenvalues for the action of $H$ range from $m + n$ to $-(m + n)$ in increments
of 2.

The eigenspace with eigenvalue $m + n$ is one dimensional, spanned by $u_m \otimes v_n$.
If $n > 0$, then the eigenspace with eigenvalue $m + n - 2$ has dimension 2, spanned
by $u_{m-2} \otimes v_n$ and $u_m \otimes v_{n-2}$. Each time we decrease the eigenvalue of $H$ by 2
we increase the dimension of the corresponding eigenspace by 1, until we reach the
eigenvalue $m - n$, whose eigenspace is spanned by the vectors


\[ u_{m-2n} \otimes v_n,\ u_{m-2n+2} \otimes v_{n-2},\ \ldots,\ u_m \otimes v_{-n}. \]


This space has dimension $n + 1$. As we continue to decrease the eigenvalue of $H$ in
increments of 2, the dimensions remain constant until we reach eigenvalue $n - m$,
at which point the dimensions begin decreasing by 1 until we reach the eigenvalue
$-m - n$, for which the corresponding eigenspace has dimension one, spanned by
$u_{-m} \otimes v_{-n}$. This pattern is illustrated by the following table, which lists, for the case
of $V_4 \otimes V_2$, each eigenvalue for $H$ and a basis for the corresponding eigenspace.


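Eigenvalue for $H$    Basis
        6            $u_4 \otimes v_2$
        4            $u_2 \otimes v_2$,     $u_4 \otimes v_0$
        2            $u_0 \otimes v_2$,     $u_2 \otimes v_0$,     $u_4 \otimes v_{-2}$
        0            $u_{-2} \otimes v_2$,  $u_0 \otimes v_0$,     $u_2 \otimes v_{-2}$
       -2            $u_{-4} \otimes v_2$,  $u_{-2} \otimes v_0$,  $u_0 \otimes v_{-2}$
       -4            $u_{-4} \otimes v_0$,  $u_{-2} \otimes v_{-2}$
       -6            $u_{-4} \otimes v_{-2}$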


Consider now the vector $u_m \otimes v_n$, which is annihilated by $X$ and is an eigenvector
for $H$ with eigenvalue $m + n$. Applying $Y$ repeatedly gives a chain of eigenvectors
for $H$ with eigenvalues decreasing by 2 until they reach $-m - n$. By the proof
of Theorem 4.32, the span $W$ of these vectors is invariant under $\mathrm{sl}(2;\mathbb{C})$ and
irreducible, isomorphic to $V_{m+n}$. The orthogonal complement of $W$ is also invariant.
Since $W$ contains each of the eigenvalues of $H$ with multiplicity one, each
eigenvalue for $H$ in $W^{\perp}$ will have its multiplicity lowered by 1. In particular, $m + n$
is not an eigenvalue for $H$ in $W^{\perp}$; the largest remaining eigenvalue is $m + n - 2$
and this eigenvalue has multiplicity one (unless $n = 0$). Thus, if we start with an
eigenvector for $H$ in $W^{\perp}$ with eigenvalue $m + n - 2$, this will be annihilated by $X$
and will generate an irreducible invariant subspace isomorphic to $V_{m+n-2}$.

We now continue on in the same way, at each stage looking at the orthogonal
complement of the sum of all the invariant subspaces we have obtained in the
previous stages. Each step reduces the multiplicity of each $H$-eigenvalue by 1 and
thereby reduces the largest remaining $H$-eigenvalue by 2. This process will continue
until there is nothing left, which will occur after $V_{m-n}$. $\square$
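
The bookkeeping in the proof can be carried out mechanically on the multiset of $H$-eigenvalues alone. The sketch below tracks only eigenvalue multiplicities, not the subspaces themselves: it repeatedly takes the largest remaining eigenvalue $p$, records a summand $V_p$, and lowers the multiplicity of each of $p, p - 2, \ldots, -p$ by one. The function names are ours.

```python
from collections import Counter

def weight_multiplicities(m, n):
    """Multiplicity of each H-eigenvalue j + k on the tensor product of V_m and V_n."""
    return Counter(j + k
                   for j in range(-m, m + 1, 2)
                   for k in range(-n, n + 1, 2))

def decompose(m, n):
    """Highest weights of the irreducible summands, found by peeling as in the proof."""
    remaining = weight_multiplicities(m, n)
    summands = []
    while sum(remaining.values()) > 0:
        p = max(w for w, mult in remaining.items() if mult > 0)
        summands.append(p)
        for w in range(-p, p + 1, 2):   # eigenvalues of H on V_p, each with multiplicity one
            remaining[w] -= 1
    return summands

print(dict(weight_multiplicities(4, 2)))
print(decompose(4, 2))
print(sum(p + 1 for p in decompose(4, 2)))   # total dimension of the summands
```

For $m = 4$ and $n = 2$ the multiplicities reproduce the table in the proof, and the summand dimensions add up to $(4 + 1)(2 + 1) = 15$.
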

Theorem C.1 tells us, for example, that the 15-dimensional space $V_4 \otimes V_2$
decomposes as the direct sum of a seven-dimensional invariant subspace isomorphic
to $V_6$, a five-dimensional invariant subspace isomorphic to $V_4$, and a
three-dimensional invariant subspace isomorphic to $V_2$. By following the arguments in
the proof of the theorem, we could, in principle, compute these subspaces explicitly.


C.2 The Wigner—Eckart Theorem 


Recall that the Lie algebras $\mathrm{su}(2)$ and $\mathrm{so}(3)$ are isomorphic. Specifically, we use the
bases $\{E_1, E_2, E_3\}$ for $\mathrm{su}(2)$ and $\{F_1, F_2, F_3\}$ for $\mathrm{so}(3)$ described in Example 3.27.
The unique linear map $\phi : \mathrm{su}(2) \to \mathrm{so}(3)$ such that $\phi(E_k) = F_k$, $k = 1, 2, 3$,
is a Lie algebra isomorphism. Thus, the representations of $\mathrm{so}(3)$ are in one-to-one
correspondence with the representations of $\mathrm{su}(2)$, which, in turn, are in one-to-one
correspondence with the complex-linear representations of $\mathrm{sl}(2;\mathbb{C})$. In particular,
the analysis of the decomposition of tensor products of sl(2;C) representations in 
the previous section applies also to SO(3) representations. 

Suppose now that $\Pi$ is a representation of $\mathrm{SO}(3)$ acting on a finite-dimensional
vector space $V$. Let $\operatorname{End}(V)$ denote the space of endomorphisms of $V$ (i.e., the space
of linear operators of $V$ into itself). Then we can define an associated action of
$\mathrm{SO}(3)$ on $\operatorname{End}(V)$ by the formula


\[ R \cdot C = \Pi(R)\, C\, \Pi(R)^{-1}, \tag{C.2} \]


for all $R \in \mathrm{SO}(3)$ and $C \in \operatorname{End}(V)$. It is easy to check that this action constitutes
a representation of $\mathrm{SO}(3)$.


Definition C.2. Let $(\Pi, V)$ be a representation of $\mathrm{SO}(3)$. For any ordered triple
$C := (C_1, C_2, C_3)$ of operators on $V$ and any vector $v \in \mathbb{R}^3$, let $v \cdot C$ be the operator


\[ v \cdot C = \sum_{j=1}^{3} v_j C_j. \tag{C.3} \]


The triple $C$ is a vector operator if


\[ (Rv) \cdot C = \Pi(R)\,(v \cdot C)\,\Pi(R)^{-1} \tag{C.4} \]


for all $R \in \mathrm{SO}(3)$.


That is to say, the triple $C$ is a vector operator if the map $v \mapsto v \cdot C$ intertwines the
obvious action of $\mathrm{SO}(3)$ on $\mathbb{R}^3$ with the action of $\mathrm{SO}(3)$ on $\operatorname{End}(V)$ given in (C.2).
Note that if, say, $R \in \mathrm{SO}(3)$ maps $e_1$ to $e_2$, then (C.4) implies that


\[ C_2 = \Pi(R)\, C_1\, \Pi(R)^{-1}. \tag{C.5} \]


Equation (C.5) then says that $C_1$ and $C_2$ are “the same operator, up to rotation.”
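
A simple finite-dimensional instance of a vector operator can be checked by hand or by machine. The sketch below takes $V = \mathbb{R}^3$ with the standard representation $\Pi(R) = R$ and the triple $C = (F_1, F_2, F_3)$, and tests (C.4) and (C.5) for one rotation. We assume here that $F_j$ is the matrix of the cross product with $e_j$; the basis of Example 3.27 is not reproduced in this appendix, so this choice, the rotation R, and the vector v should be read as illustrative assumptions.

```python
def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Assumed basis of so(3): F_j is the matrix of the cross product with e_j.
F = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def v_dot(v, C):
    """v . C = sum over j of v_j C_j, as in (C.3)."""
    return [[sum(v[j] * C[j][r][c] for j in range(3)) for c in range(3)]
            for r in range(3)]

def conjugate(R, C):
    """R C R^{-1} for the standard representation, using R^{-1} = R^T."""
    return matmul(matmul(R, C), transpose(R))

# Rotation by 90 degrees about the e3-axis; it sends e1 to e2.
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
v = [2, -1, 3]

c4_holds = v_dot(apply(R, v), F) == conjugate(R, v_dot(v, F))   # (C.4) for this R and v
c5_holds = conjugate(R, F[0]) == F[1]                            # (C.5)
print(c4_holds, c5_holds)
```
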


Example C.3. Let $V$ be the space of smooth functions on $\mathbb{R}^3$ and define an action
of $\mathrm{SO}(3)$ on $V$ by


\[ (\Pi(R)f)(x) = f(R^{-1}x). \tag{C.6} \]


Define operators $X = (X_1, X_2, X_3)$ on $V$ by


\[ (X_j f)(x) = x_j f(x). \]


Then $X$ is a vector operator.


Note that $X_j$ is the operator of “multiplication by $x_j$.” The operators $X_1$, $X_2$, and
$X_3$ are called the position operators in the physics literature.
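
The claim of Example C.3 can be spot-checked at a single point by modeling operators as Python functions acting on functions. In the sketch below, the test function f, the rotation R, and the points v and x are arbitrary illustrative choices; Pi and v_dot_X are our names for the action (C.6) and for the operator $v \cdot X$.

```python
def apply(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(v, x):
    return sum(a * b for a, b in zip(v, x))

def Pi(R):
    """Pi(R) sends f to the function x -> f(R^{-1} x), as in (C.6); here R^{-1} = R^T."""
    R_inv = transpose(R)
    return lambda f: (lambda x: f(apply(R_inv, x)))

def v_dot_X(v):
    """v . X sends f to the function x -> (v . x) f(x)."""
    return lambda f: (lambda x: dot(v, x) * f(x))

def f(x):
    # Arbitrary polynomial test function (an illustrative choice).
    return x[0] ** 2 + 3 * x[1] - x[0] * x[2]

R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by 90 degrees about the e3-axis
v = [2, -1, 3]
x = [1, 4, -2]

lhs = v_dot_X(apply(R, v))(f)(x)                    # {[(Rv) . X] f}(x)
rhs = Pi(R)(v_dot_X(v)(Pi(transpose(R))(f)))(x)     # [Pi(R) (v . X) Pi(R)^{-1} f](x)
print(lhs, rhs)
```

Since $\Pi(R)^{-1} = \Pi(R^{-1})$ and $R^{-1} = R^{T}$ for a rotation, the inverse operator is obtained as Pi(transpose(R)).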


Proof. For any $v \in \mathbb{R}^3$ and $R \in \mathrm{SO}(3)$, we have


\[ \{[(Rv) \cdot X] f\}(x) = ((Rv) \cdot x)\, f(x). \]


On the other hand, we compute that


\[ [(v \cdot X)\,\Pi(R)^{-1} f](x) = (v \cdot x)\, f(Rx), \]


so that


\[ [\Pi(R)\,(v \cdot X)\,\Pi(R)^{-1} f](x) = (v \cdot R^{-1}x)\, f(x) = ((Rv) \cdot x)\, f(x), \]


as required for a vector operator. $\square$

We are now ready for our first version of the Wigner–Eckart theorem.


Theorem C.4. Let $(\Pi, V)$ be an irreducible finite-dimensional representation of
$\mathrm{SO}(3)$, and let $A$ and $B$ be two vector operators on $V$, with $A$ being nonzero. Then
there exists a constant $c$ such that


B = cA. 


The computational significance of the theorem is as follows. For each irreducible 
representation V, if we can find one single vector operator A acting on V, then 
the action of any other vector operator on V is completely determined by a single 
constant c. There are two ingredients in the proof. The first is Schur’s lemma and the 
second is Theorem C.1, which implies (as we will see shortly) that when End(V) 
decomposes as a direct sum of irreducibles, the (complexification of the) standard
representation of SO(3) occurs at most once.


Lemma C.5. Let $\Pi$ be a finite-dimensional, irreducible representation of $\mathrm{SO}(3)$
acting on a vector space $V$, and let $\mathrm{SO}(3)$ act also on $\operatorname{End}(V)$ as in (C.2). Then


\[ \operatorname{End}(V) \cong V \otimes V, \]


where $\cong$ denotes an isomorphism of $\mathrm{SO}(3)$ representations.


Proof. For any finite-dimensional vector space $V$, there is, by Definition 4.13, a
unique linear map $\Psi : V \otimes V^{*} \to \operatorname{End}(V)$ such that for all $v \in V$ and
$\phi \in V^{*}$, we have


\[ \Psi(v \otimes \phi)(w) = \phi(w)\, v. \]


By computing on a basis, it is easy to check that $\Psi$ is an isomorphism of vector
spaces. If, in addition, $V$ is a representation of $\mathrm{SO}(3)$, then $\Psi$ is an isomorphism of
representations, where $\mathrm{SO}(3)$ acts on $V^{*}$ as in Sect. 4.3.3 and acts on $\operatorname{End}(V)$ as in
(C.2). (Compare Exercises 3 and 4 in Chapter 12.) Thus, $\operatorname{End}(V) \cong V \otimes V^{*}$.
Meanwhile, every irreducible representation of $\mathrm{SO}(3)$ is isomorphic to its dual.
This can be seen either by noting that there is only one irreducible representation
in each dimension, or (more fundamentally) by noting that $-I$ is an element of the
Weyl group of the $A_1$ root system. (Compare Exercise 10 in Chapter 10.) Thus,
actually, $\operatorname{End}(V) \cong V \otimes V$, as claimed. $\square$


Proof of Theorem C.4. The action of $\mathrm{SO}(3)$ on $\mathbb{R}^3$ is irreducible. Indeed, the
associated action of $\mathrm{SO}(3)$ on $\mathbb{C}^3$ is irreducible; this is the unique irreducible
representation of $\mathrm{SO}(3)$ of dimension 3. Now, the linear map $v \mapsto v \cdot A$ extends to a
complex linear map from $\mathbb{C}^3$ into $\operatorname{End}(V)$, and this extension is still an intertwining
map.

Meanwhile, $\operatorname{End}(V) \cong V \otimes V$, by the lemma, and $V \otimes V$ decomposes as a
direct sum of irreducibles, as in Theorem C.1. In this decomposition, the
three-dimensional irreducible representation $V_2$ of $\mathrm{SO}(3)$ occurs exactly once, unless $V$
is trivial. Thus, by Schur’s lemma, the map $v \mapsto v \cdot A$ must be zero if $V$ is trivial and
must map into the unique copy of $\mathbb{C}^3$ if $V$ is nontrivial. Of course, the same holds
for the map $v \mapsto v \cdot B$. Applying Schur’s lemma a second time, we see that if $A$ is
nonzero, $B$ must be a multiple of $A$. $\square$
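
The multiplicity count used in this proof follows directly from the formula in Theorem C.1. The sketch below lists the highest weights occurring in $V_m \otimes V_m$ and counts how often the three-dimensional representation $V_2$ appears; the function names are ours, and the indices are the $\mathrm{sl}(2;\mathbb{C})$ labels of the text (the representations of $\mathrm{SO}(3)$ itself correspond to even $m$).

```python
def clebsch_gordan(m, n):
    """Highest weights of the summands of V_m tensor V_n, per Theorem C.1."""
    m, n = max(m, n), min(m, n)
    return list(range(m + n, m - n - 1, -2))

def multiplicity_of_V2_in_End(m):
    """How often V_2 occurs in End(V_m), identified with V_m tensor V_m (Lemma C.5)."""
    return clebsch_gordan(m, m).count(2)

for m in range(5):
    print(m, clebsch_gordan(m, m), multiplicity_of_V2_in_End(m))
```

The count is zero for the trivial representation $V_0$ and one otherwise, which is the input needed for the two applications of Schur's lemma.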


We now turn to a more general form of the Wigner—Eckart theorem, in which 
the space V on which the vector operators act is not assumed irreducible, or even 
finite dimensional. Rather, the theorem describes how vector operators act relative 
to a pair of irreducible invariant subspaces of V. 


Theorem C.6 (Wigner–Eckart). Let $V$ be an inner product space, possibly
infinite dimensional. Suppose $\Pi$ is a representation of $\mathrm{SO}(3)$ acting on $V$ in an
inner-product-preserving fashion. Let $W_1$ and $W_2$ be finite-dimensional, irreducible
subspaces of $V$. Suppose $A$ and $B$ are two vector operators on $V$ and that $\langle w, A_j w' \rangle$
is nonzero for some $w \in W_1$, $w' \in W_2$, and $j \in \{1, 2, 3\}$. Then there exists a
constant $c$ such that


\[ \langle w, B_j w' \rangle = c\, \langle w, A_j w' \rangle \]


for all $w \in W_1$, all $w' \in W_2$, and all $j = 1, 2, 3$.


In many applications, the space $V$ is $L^2(\mathbb{R}^3)$, the space of square-integrable
functions on $\mathbb{R}^3$, where $\mathrm{SO}(3)$ acts on $L^2(\mathbb{R}^3)$ by the same formula as in (C.6).
The irreducible, $\mathrm{SO}(3)$-invariant subspaces of $L^2(\mathbb{R}^3)$ are described in Section 17.7
of [Hall]. The computational significance of the theorem is similar to that of
Theorem C.4: For each pair of irreducible subspaces $W_1$ and $W_2$, the “matrix
entries” of any vector operator between $W_1$ and $W_2$ (i.e., the quantities $\langle w, A_j w' \rangle$
with $w \in W_1$ and $w' \in W_2$) are the same, up to a constant. Indeed, these matrix
entries really depend only on the isomorphism class of $W_1$ and $W_2$. Thus, if one
can compute the matrix entries for some vector operator once and for all (for each
pair of irreducible representations of $\mathrm{SO}(3)$), the matrix entries for any other vector
operator are then determined up to the calculation of a single constant.


Proof. Note that the operators $A_j$ and $B_j$ (or more generally, $v \cdot A$ and $v \cdot B$,
for $v \in \mathbb{R}^3$) do not necessarily map $W_2$ into $W_1$. On the other hand, taking the
inner product of, say, $A_j w'$ with an element $w$ of $W_1$ has
the effect of projecting $A_j w'$ onto $W_1$, since the inner product only depends on
the component of $A_j w'$ in $W_1$. With this observation in mind, let $P_1 : V \to W_1$
be the orthogonal projection onto $W_1$. (This operator exists even if $V$ is not a
Hilbert space and can be constructed using an orthonormal basis for $W_1$.) Let
$\operatorname{Hom}(W_2, W_1)$ denote the space of linear operators from $W_2$ to $W_1$ and define a
linear map $\phi_A : \mathbb{R}^3 \to \operatorname{Hom}(W_2, W_1)$ by


\[ \phi_A(v)(w) = P_1 (v \cdot A)(w) \]


for all $w \in W_2$.
Now, since both $W_1$ and $W_2$ are invariant, if $C$ belongs to $\operatorname{Hom}(W_2, W_1)$, then so
does the operator


\[ \Pi(R)\, C\, \Pi(R)^{-1} \tag{C.7} \]


for all $R \in \mathrm{SO}(3)$. Under the action (C.7), the space $\operatorname{Hom}(W_2, W_1)$ becomes a
representation of $\mathrm{SO}(3)$. We now claim that $\phi_A$ is an intertwining map from $\mathbb{R}^3$ into
$\operatorname{Hom}(W_2, W_1)$. To see this, note that since $A$ is a vector operator, we have


\[ \phi_A(Rv)(w) = P_1 \Pi(R)\,(v \cdot A)\,\Pi(R)^{-1}(w). \tag{C.8} \]


But since $W_1$ is invariant and the action of $\mathrm{SO}(3)$ preserves the inner product, $W_1^{\perp}$ is
also invariant, in which case we can see that $P_1$ commutes with $\Pi(R)$. Thus, (C.8)
becomes


\[ \phi_A(Rv)(w) = \Pi(R)\, \phi_A(v)\, \Pi(R)^{-1}(w), \]


as claimed.
Now, by a simple modification of the proof of Theorem C.4, we have


\[ \operatorname{Hom}(W_2, W_1) \cong W_1 \otimes W_2^{*} \cong W_1 \otimes W_2, \]


where $\cong$ denotes isomorphism of $\mathrm{SO}(3)$ representations. By Theorem C.1, in the
decomposition of $W_1 \otimes W_2$, the three-dimensional irreducible representation $\mathbb{C}^3$ of
$\mathrm{SO}(3)$ occurs at most once. If $\mathbb{C}^3$ does not occur, then $\phi_A$ must be identically zero,
and similarly for the analogously defined map $\phi_B$. If $\mathbb{C}^3$ does occur, both $\phi_A$ and
$\phi_B$ must map into the same irreducible subspace of $\operatorname{Hom}(W_2, W_1)$, and, by Schur’s
lemma, they must be equal up to a constant.

Finally, note that the orthogonal projection $P_1$ is self-adjoint on $V$ and is equal
to the identity on $W_1$. Thus,

\[ \langle w, P_1 (v \cdot A) w' \rangle = \langle P_1 w, (v \cdot A) w' \rangle
   = \langle w, (v \cdot A) w' \rangle, \]

and similarly with $A$ replaced by $B$. Thus, since $\phi_B = c\,\phi_A$, we have

\[ \langle w, (v \cdot B) w' \rangle = c \,\langle w, (v \cdot A) w' \rangle \]

for all $v \in \mathbb{R}^3$. Specializing to $v = e_j$, $j = 1, 2, 3$, gives the
claimed result. □


C.3 More on Vector Operators 


We now look a bit more closely at the notion of vector operator. We consider first the
Lie algebra counterpart to Definition C.2. We use the basis $\{F_1, F_2, F_3\}$ for
so(3) from Example 3.27. For $j, k, l \in \{1, 2, 3\}$, define $\varepsilon_{jkl}$ as
follows:

\[ \varepsilon_{jkl} = \begin{cases}
   0 & \text{if any two of } j, k, l \text{ are equal} \\
   1 & \text{if } (j, k, l) \text{ is a cyclic permutation of } (1, 2, 3) \\
   -1 & \text{if } (j, k, l) \text{ is a non-cyclic permutation of } (1, 2, 3).
   \end{cases} \]

Thus, for example, $\varepsilon_{112} = 0$ and $\varepsilon_{132} = -1$. The
commutation relations among $F_1$, $F_2$, and $F_3$ may be written as

\[ [F_j, F_k] = \sum_{l=1}^{3} \varepsilon_{jkl} F_l. \]


Proposition C.7. Let $(\Pi, V)$ be a finite-dimensional representation of SO(3) and
let $\pi$ be the associated representation of so(3). Then a triple
$C = (C_1, C_2, C_3)$ of operators is a vector operator if and only if

\[ (Xv) \cdot C = \pi(X)(v \cdot C) - (v \cdot C)\pi(X) \tag{C.9} \]

for all $X \in$ so(3). This condition, in turn, holds if and only if $C_1$, $C_2$, and
$C_3$ satisfy

\[ [\pi(F_j), C_k] = \sum_{l=1}^{3} \varepsilon_{jkl} C_l. \tag{C.10} \]


In physics terminology, the operators $\pi(F_j)$ are (up to a factor of $i\hbar$,
where $\hbar$ is Planck’s constant) the angular momentum operators. See Section 17.3
of [Hall] for more information.


Proof. If SO(3) acts on End(V) as $R \cdot C = \Pi(R) C \Pi(R)^{-1}$, the associated
action of so(3) on End(V) is $X \cdot C = \pi(X) C - C \pi(X)$. The condition (C.9) is
just the assertion that the map $v \mapsto v \cdot C$ is an intertwining map between
the action of so(3) on $\mathbb{R}^3$ and its action on End(V). Since SO(3) is
connected, it is easy to see that this condition is equivalent to the intertwining
property in Definition C.2.

Meanwhile, (C.9) will hold if and only if it holds for $X = F_j$ and $v = e_k$, for
all $j, k = 1, 2, 3$. Now, direct calculation with the matrices $F_1$, $F_2$, and
$F_3$ in Example 3.27 shows that $F_j e_k = \sum_{l=1}^{3} \varepsilon_{jkl} e_l$.
Putting $X = F_j$ and $v = e_k$ in (C.9) gives

\[ \sum_{l=1}^{3} \varepsilon_{jkl} C_l = [\pi(F_j), C_k], \]

as claimed. □
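The "direct calculation" used in this proof can be checked mechanically. The sketch below is only an illustration: it assumes that the basis of Example 3.27 (not reproduced in this section) consists of the standard generators of rotations about the three coordinate axes, and the helper names `eps`, `matmul`, `commutator`, and `lincomb` are hypothetical. It verifies both the commutation relations and the identity $F_j e_k = \sum_l \varepsilon_{jkl} e_l$, with indices 0, 1, 2 standing for 1, 2, 3.

```python
# Assumed form of the basis F_1, F_2, F_3 of so(3) (standard rotation generators).
F = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],   # F_1
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],   # F_2
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],   # F_3
]

def eps(j, k, l):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    if len({j, k, l}) < 3:
        return 0
    return 1 if (k - j) % 3 == 1 else -1

def matmul(a, b):
    return [[sum(a[i][m] * b[m][n] for m in range(3)) for n in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][n] - ba[i][n] for n in range(3)] for i in range(3)]

def lincomb(coeffs, mats):
    return [[sum(c * m[i][n] for c, m in zip(coeffs, mats)) for n in range(3)]
            for i in range(3)]

def check_commutation_relations():
    # [F_j, F_k] = sum_l eps_{jkl} F_l for all j, k
    return all(
        commutator(F[j], F[k]) == lincomb([eps(j, k, l) for l in range(3)], F)
        for j in range(3) for k in range(3)
    )

def check_action_on_basis():
    # F_j e_k is column k of F_j; compare with the vector (eps_{jkl})_l.
    return all(
        [F[j][i][k] for i in range(3)] == [eps(j, k, i) for i in range(3)]
        for j in range(3) for k in range(3)
    )
```

Under the stated assumption on the basis, both checks return `True`. The first check also says that, in the standard representation, the triple $(F_1, F_2, F_3)$ itself satisfies (C.10).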


There is one last aspect of vector operators that should be mentioned. In quantum 
physics, it is expected that the vector space of states should carry an action of the 
rotation group SO(3). This action may not, however, be an ordinary representation, 
but rather a projective representation. This means that the action is allowed to 
be ill defined up to a constant. The reason for allowing this flexibility is that in 
quantum mechanics, two vectors that differ by a constant are considered the same 
physical state. (See Section 16.7.3 of [Hall] for more information on projective 
representations.) In particular, the space of states for a “spin one-half” particle 
carries a projective representation of SO(3) that does not come from an ordinary 
representation of SO(3). 

Suppose, for example, that $V$ carries an action of the group SU(2), rather than
SO(3). Suppose, also, that the action of the element $-I \in SU(2)$ on $V$ is either
as $I$ or as $-I$. If the action of $-I \in SU(2)$ on $V$ is as $I$, then as in the
proof of Proposition 4.35, the representation will descend to a representation of
$SO(3) \cong SU(2)/\{I, -I\}$ on $V$. Even if the action of $-I \in SU(2)$ on $V$ is as
$-I$, we can still construct a representation of SO(3) that is well defined up to a
constant; that is, $V$ still carries a projective representation of SO(3).
Furthermore, the associated action of $-I \in SU(2)$ on End(V) will satisfy

\[ (-I) \cdot C = (-I) C (-I)^{-1} = C. \]

Thus, the action of SU(2) on End(V) still descends to an (ordinary) action of
SO(3). We can, therefore, still define vector operators in the setting of projective
representations of SO(3), and the proof of the Wigner–Eckart theorem goes through
with only minor changes.


Appendix D 
Completeness of Characters 


In this appendix, we sketch a proof of the completeness of characters (Theorem 
12.18) for an arbitrary compact Lie group, not assumed to be isomorphic to a matrix 
group. The proof requires some functional analytic results, notably the spectral 
theorem for compact self-adjoint operators. The needed results from functional 
analysis may be found, for example, in Chapter II of [Kna1].

As in the proof for matrix groups in Chapter 12, we first prove that general 
functions can be approximated by matrix entries of irreducible representations 
and then specialize to the case of class functions. We will begin by showing that 
any finite-dimensional, translation-invariant space of functions on K decomposes 
in terms of matrix entries. We will then construct such spaces of functions as 
eigenspaces of certain convolution operators. 

We consider the normalized left-invariant volume form $\alpha$ on $K$. If we translate
$\alpha$ on the right by some $x \in K$, the resulting form $\alpha^x$ is easily seen
to be, again, a left-invariant volume form, which must agree with $\alpha$ up to a
constant. On the other hand, $\alpha^x$ is still normalized, so it must actually agree
with $\alpha$. Similarly, the pullback of $\alpha$ by the map $x \mapsto x^{-1}$ is
easily seen to be left-invariant and normalized and thus coincides with $\alpha$. Thus,
$\alpha$ is invariant under both left and right translations and under inversions.

Now, integration of a smooth function $f$ against $\alpha$ satisfies

\[ \left| \int_K f \alpha \right| \le \sup_K |f|. \]

Meanwhile, by the Stone–Weierstrass theorem (Theorem 7.33 in [Rud1]), every
continuous function on $K$ can be uniformly approximated by smooth functions.
Thus, the map $f \mapsto \int_K f\alpha$ extends by continuity from smooth functions to
continuous functions, and if $f$ is non-negative, $\int_K f\alpha$ will be
non-negative. It then follows from the Riesz representation theorem that there is a
unique measure $\mu$ on the Borel $\sigma$-algebra in $K$ such that

\[ \int_K f \alpha = \int_K f \, d\mu \]


for all continuous functions $f$ on $K$. (See Theorems 2.14 and 2.18 in [Rud2].) Since
$\alpha$ is normalized and invariant under left and right translations and inversions,
the same is true of $\mu$. We refer to $\mu$ as the (bi-invariant, normalized) Haar
measure on $K$. We consider the Hilbert space $L^2(K)$, the space of (equivalence
classes of almost-everywhere-equal) square-integrable functions on $K$ with respect
to $\mu$.

We make use of the left translation and right translation operators, given by

\[ (L_x f)(y) = f(x^{-1} y), \qquad (R_x f)(y) = f(yx). \]

Both $L_{\cdot}$ and $R_{\cdot}$ constitute representations of $K$ acting on $L^2(K)$.
A subspace $V \subset L^2(K)$ is left invariant, right invariant, or bi-invariant if
it is invariant under left translations, right translations, or both left and right
translations, respectively.


Proposition D.1. Suppose $V \subset L^2(K)$ is a finite-dimensional, bi-invariant
subspace and that each element of $V$ is continuous. Then each element of $V$ can
be expressed as a finite linear combination of matrix entries for irreducible
representations of $K$.


Saying that an element $f$ of $L^2(K)$ is continuous means, more precisely, that the
equivalence class $f$ has a (necessarily unique) continuous representative.


Proof. By complete reducibility, we may decompose $V$ into subspaces $V_j$ that are
finite-dimensional and irreducible under the right action of $K$. Since the elements
of $V_j$ are continuous, “evaluation at the identity” is a well-defined linear
functional on the finite-dimensional space $V_j$. Thus, there exists an element
$\chi_j$ of $V_j$ such that

\[ f(e) = \langle \chi_j, f \rangle \]

for all $f \in V_j$. It follows that for all $f \in V_j$, we have

\[ f(x) = (R_x f)(e) = \langle \chi_j, R_x f \rangle
   = \mathrm{trace}\big(R_x \, |f\rangle\langle\chi_j| \big), \]

where $|f\rangle\langle\chi_j|$ is the operator mapping $g \in V_j$ to
$\langle \chi_j, g \rangle f$. Thus, each $f \in V_j$ is a matrix entry of the
irreducible representation $(R_{\cdot}, V_j)$ of $K$ and each $f \in V$ is a
linear combination of such matrix entries. □
Definition D.2. If $f$ and $g$ are in $L^2(K)$, the convolution of $f$ and $g$ is the
function $f * g$ on $K$ given by

\[ (f * g)(x) = \int_K f(x y^{-1}) g(y) \, d\mu(y). \tag{D.1} \]


A key property of convolution is that convolution on the left commutes with
translation on the right, and vice versa. That is to say,

\[ (L_x f) * g = L_x (f * g) \tag{D.2} \]

and

\[ f * (R_x g) = R_x (f * g). \tag{D.3} \]

Intuitively, $f * g$ can be viewed as a combination of right-translates of $f$,
weighted by the function $g$. Thus, say, (D.2) boils down to the fact that right
translation commutes with left translation, which is just a different way of stating
that multiplication on $K$ is associative. Rigorously, both (D.2) and (D.3) follow
easily from the definition of convolution.

Using the Cauchy–Schwarz inequality and the invariance of $\mu$ under translation
and inversion, we see that

\[ |(f * g)(x)| \le \|f\|_{L^2(K)} \|g\|_{L^2(K)} \tag{D.4} \]

for all $x \in K$. If $f$ and $g$ are continuous, then (since $K$ is compact) $f$ is
automatically uniformly continuous, from which it follows that $f * g$ is continuous.
For any $f$ and $g$ in $L^2(K)$, we can approximate $f$ and $g$ in $L^2(K)$ by
continuous functions and show, with the help of (D.4), that $f * g$ is continuous. We
may also “move the norm inside the integral” in (D.1) to obtain the inequality

\[ \|f * g\|_{L^2(K)} \le \|f\|_{L^2(K)} \|g\|_{L^1(K)}. \tag{D.5} \]
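For readers who want the Cauchy–Schwarz step behind (D.4) spelled out, one possible way to write it is sketched here. It uses only the definition (D.1) and the invariance of $\mu$ under inversion and left translation; it is a filling-in of the step, not necessarily the argument the text has in mind.

```latex
\begin{align*}
|(f * g)(x)|
  &\le \int_K |f(x y^{-1})| \, |g(y)| \, d\mu(y) \\
  &\le \left( \int_K |f(x y^{-1})|^2 \, d\mu(y) \right)^{1/2} \|g\|_{L^2(K)}
     && \text{(Cauchy--Schwarz)} \\
  &= \left( \int_K |f(x y)|^2 \, d\mu(y) \right)^{1/2} \|g\|_{L^2(K)}
     && \text{(invariance under } y \mapsto y^{-1}\text{)} \\
  &= \left( \int_K |f(y)|^2 \, d\mu(y) \right)^{1/2} \|g\|_{L^2(K)}
     && \text{(invariance under left translation)} \\
  &= \|f\|_{L^2(K)} \, \|g\|_{L^2(K)}.
\end{align*}
```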


Unlike convolution on the real line, convolution on a noncommutative group is, 
in general, noncommutative. Nevertheless, we have the following result. 


Proposition D.3. If $f \in L^2(K)$ is a class function, then for all $g \in L^2(K)$,
we have

\[ f * g = g * f. \]


Proof. If we make the change of variable $z = y^{-1} x$, so that $y = x z^{-1}$ and
$y^{-1} = z x^{-1}$, we find that

\[ (f * g)(x) = \int_K f(x z x^{-1}) \, g(x z^{-1}) \, d\mu(z). \]

Since $f$ is a class function, this expression reduces to

\[ (f * g)(x) = \int_K g(x z^{-1}) f(z) \, d\mu(z) = (g * f)(x), \]

as claimed. □
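The identities (D.2), (D.3), and Proposition D.3 can be tried out on a finite group, where the Haar measure is just normalized counting measure. The sketch below uses the symmetric group $S_3$ as a stand-in for $K$; this substitution is an assumption made for illustration only (a finite group is a compact group, but of course not the general setting of this appendix), and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))  # the six elements of S_3

def mul(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    return tuple(sorted(range(3), key=lambda i: p[i]))

def convolve(f, g):
    """Finite analogue of (D.1): (f*g)(x) = (1/|G|) sum_y f(x y^{-1}) g(y)."""
    return {x: Fraction(sum(f[mul(x, inv(y))] * g[y] for y in G), len(G))
            for x in G}

def left_translate(x, f):
    """(L_x f)(y) = f(x^{-1} y)."""
    return {y: f[mul(inv(x), y)] for y in G}

def right_translate(x, f):
    """(R_x f)(y) = f(y x)."""
    return {y: f[mul(y, x)] for y in G}

def delta(a):
    """Indicator function of the single element a."""
    return {p: int(p == a) for p in G}

# A class function: the number of fixed points is constant on conjugacy classes.
fixed_points = {p: sum(p[i] == i for i in range(3)) for p in G}

a, b = (1, 0, 2), (0, 2, 1)  # two transpositions that do not commute

# Proposition D.3: a class function commutes with everything under convolution.
class_function_commutes = all(
    convolve(fixed_points, delta(x)) == convolve(delta(x), fixed_points)
    for x in G
)
# Without the class-function hypothesis, convolution need not commute.
deltas_commute = convolve(delta(a), delta(b)) == convolve(delta(b), delta(a))
# (D.2): (L_x f) * g = L_x (f * g)
d2_holds = all(
    convolve(left_translate(x, delta(a)), fixed_points)
    == left_translate(x, convolve(delta(a), fixed_points))
    for x in G
)
# (D.3): f * (R_x g) = R_x (f * g)
d3_holds = all(
    convolve(delta(a), right_translate(x, delta(b)))
    == right_translate(x, convolve(delta(a), delta(b)))
    for x in G
)
```

Here `class_function_commutes`, `d2_holds`, and `d3_holds` come out `True`, while `deltas_commute` is `False`: the convolution of the indicators of `a` and `b` is supported at the product `ab`, which differs from `ba`.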


We now introduce the properties of operators that will feature in our version of 
the spectral theorem. 


Definition D.4. Let $H$ be a Hilbert space and $A$ a bounded linear operator on $H$.
Then $A$ is self-adjoint if

\[ \langle u, Av \rangle = \langle Au, v \rangle \]

for all $u$ and $v$ in $H$, and $A$ is compact if for every bounded set
$E \subset H$, the image of $E$ under $A$ has compact closure in $H$.


Here compactness is understood to be relative to the norm topology on H. If H is 
infinite dimensional, the closed unit ball in H is not compact in the norm topology 
and thus, for example, the identity operator on H is not compact. 


Proposition D.5. If $\phi \in L^2(K)$ is real-valued and invariant under
$x \mapsto x^{-1}$, the convolution operator $C_\phi$ given by

\[ C_\phi(f) = \phi * f \]

is self-adjoint and compact.


Proof. The operator $C_\phi$ is an integral operator with integral kernel
$k(x, y) = \phi(x y^{-1})$. Now, an integral operator is self-adjoint precisely if its
kernel satisfies $k(x, y) = \overline{k(y, x)}$. In the case of $C_\phi$, this relation
holds because

\[ \phi(x y^{-1}) = \overline{\phi(y x^{-1})}, \]

as a consequence of our assumptions on $\phi$. Meanwhile, since $\phi$ is square
integrable over $K$ and $K$ has finite measure, the function
$k(x, y) = \phi(x y^{-1})$ is square integrable over $K \times K$. It follows that
$C_\phi$ is a Hilbert–Schmidt operator, and therefore compact. (See Theorem 2.4 in
Chapter II of [Kna1].) □
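The kernel symmetry in this proof is easy to see concretely on a finite group. As a hedged illustration, the sketch below again uses $S_3$ with counting measure in place of $K$ (an assumption for illustration only; in finite dimensions compactness is automatic, so only self-adjointness is being tested). The names `kernel_matrix` and `is_symmetric` are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))  # the six elements of S_3

def mul(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    return tuple(sorted(range(3), key=lambda i: p[i]))

def kernel_matrix(phi):
    """Matrix of k(x, y) = phi(x y^{-1}); up to the factor 1/|G| this is C_phi."""
    return [[phi[mul(x, inv(y))] for y in G] for x in G]

def is_symmetric(m):
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))

# An arbitrary real-valued function that is NOT invariant under inversion:
# the two 3-cycles are inverse to each other and receive different values.
raw = {p: n * n for n, p in enumerate(G)}

# Symmetrizing under x -> x^{-1} gives a function satisfying the hypotheses.
phi = {p: raw[p] + raw[inv(p)] for p in G}

symmetric_for_phi = is_symmetric(kernel_matrix(phi))   # hypotheses hold
symmetric_for_raw = is_symmetric(kernel_matrix(raw))   # hypotheses fail
```

With these choices `symmetric_for_phi` is `True` and `symmetric_for_raw` is `False`, so in this example the inversion-invariance of $\phi$ is exactly what makes the kernel symmetric.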


Since K is compact, we can construct an inner product on the Lie algebra $\mathfrak{k}$ of K
that is invariant under the adjoint action of K. Thinking of $\mathfrak{k}$ as the tangent space to
K at the identity, we may then extend this inner product to an inner product on every 
other tangent space by using (equivalently) either left or right translations. Thus, we 
obtain a bi-invariant Riemannian metric on K, which we use in the following result. 


Proposition D.6. Let $B_\varepsilon(I)$ denote the ball of radius $\varepsilon$ about
$I \in K$. There exists a sequence $\langle \phi_n \rangle$ of non-negative class
functions on $K$ such that (1) $\mathrm{supp}(\phi_n) \subset B_{1/n}(I)$,
(2) $\phi_n(x^{-1}) = \phi_n(x)$ for all $x \in K$, and
(3) $\int_K \phi_n(x) \, d\mu(x) = 1$. If $\langle \phi_n \rangle$ is any such
sequence, then

\[ \lim_{n \to \infty} \|\phi_n * f - f\|_{L^2(K)} = 0 \]

for all $f \in L^2(K)$.


We may think of the functions $\phi_n$ in the proposition as approximating a
“$\delta$-function” at the identity on $K$.


Proof. Since the metric on $K$ is bi-invariant, each $B_\varepsilon(I)$ is invariant
under the adjoint action of $K$. Thus, if $\psi_n$ is any non-negative function with
support in $B_{1/n}(I)$ that integrates to 1, we may define

\[ \tilde{\psi}_n(x) = \int_K \psi_n(y x y^{-1}) \, d\mu(y), \]

and $\tilde{\psi}_n$ will be a class function, still supported in $B_{1/n}(I)$ and
still integrating to 1. We may then define

\[ \phi_n(x) = \tfrac{1}{2} \big( \tilde{\psi}_n(x) + \tilde{\psi}_n(x^{-1}) \big) \]

and $\phi_n$ will have the required properties. (Note that
$d(x^{-1}, I) = d(I, x)$ by the left invariance of the metric.)

Suppose $g$ is continuous (and thus uniformly continuous) on $K$. Then if $n$ is
large enough, we will have $|g(y) - g(x)| < \varepsilon$ whenever $d(y, x) < 1/n$.
Now, since $\phi_n$ is normalized, we have

\[ (\phi_n * g)(x) - g(x) = \int_K \phi_n(x y^{-1}) \big( g(y) - g(x) \big) \, d\mu(y) \]

and so, for large $n$,

\[ |(\phi_n * g)(x) - g(x)| \le \int_K \phi_n(x y^{-1}) \, |g(y) - g(x)| \, d\mu(y)
   \le \varepsilon \int_K \phi_n(x y^{-1}) \, d\mu(y) = \varepsilon. \]

We conclude that $\phi_n * g$ converges uniformly (and thus, also, in $L^2(K)$) to $g$.
If $f \in L^2(K)$ is arbitrary, we choose a continuous function $g$ close to $f$
in $L^2(K)$ and observe that

\begin{align*}
\|\phi_n * f - f\|_{L^2(K)}
  &\le \|\phi_n * f - \phi_n * g\|_{L^2(K)} + \|\phi_n * g - g\|_{L^2(K)}
     + \|g - f\|_{L^2(K)} \\
  &\le \|\phi_n\|_{L^1(K)} \|f - g\|_{L^2(K)} + \|\phi_n * g - g\|_{L^2(K)}
     + \|g - f\|_{L^2(K)},
\end{align*}

where in the second inequality, we have used (D.5) and Proposition D.3. Since $\phi_n$
is non-negative and integrates to 1, $\|\phi_n\|_{L^1(K)} = 1$ for all $n$. Thus, if
we take $g$ with $\|f - g\| < \varepsilon/3$ and then choose $N$ so that
$\|\phi_n * g - g\| < \varepsilon/3$ for $n > N$, we see that
$\|\phi_n * f - f\| < \varepsilon$ for $n > N$. □


We now appeal to a general functional analytic result, the spectral theorem for 
compact self-adjoint operators. 


Theorem D.7 (Spectral Theorem for Compact Self-adjoint Operators). Suppose
$H$ is an infinite-dimensional, separable Hilbert space and $A$ is a compact,
self-adjoint operator on $H$. Then $H$ has an orthonormal basis consisting of
eigenvectors of $A$, with real eigenvalues that tend to zero.


For a proof, see Section II.2 of [Kna1]. Since the eigenvalues tend to zero, a fixed
nonzero number can occur only finitely many times as an eigenvalue; that is, each
eigenspace with a nonzero eigenvalue is finite dimensional.


Theorem D.8. If $K$ is any compact Lie group, the space of matrix entries is dense
in $L^2(K)$.


Proof. Let us say that a function $f \in L^2(K)$ is $K$-finite if there exists a
finite-dimensional space of continuous functions on $K$ that contains $f$ and is
invariant under both left and right translation. In light of Proposition D.1, it
suffices to show that the space of $K$-finite functions is dense in $L^2(K)$.

To prove this claim, suppose $g \in L^2(K)$ is orthogonal to every $K$-finite function
$f$. If $\langle \phi_n \rangle$ is as in Proposition D.6, then $\phi_n * g$ converges
to $g$ in $L^2(K)$. Since $\phi_n$ is a class function, Proposition D.3 and the
identities (D.2) and (D.3) tell us that the convolution operator $C_{\phi_n}$ commutes
with both left and right translations. Thus, the eigenspaces of $C_{\phi_n}$ are
invariant under both left and right translations. Furthermore, since $\phi_n * f$ is
continuous for any $f \in L^2(K)$, the eigenvectors of $C_{\phi_n}$ with nonzero
eigenvalues must be continuous. Finally, since $C_{\phi_n}$ is compact and
self-adjoint, the eigenspaces for $C_{\phi_n}$ with nonzero eigenvalues are
finite-dimensional. Thus, eigenvectors for $C_{\phi_n}$ with nonzero eigenvalues are
$K$-finite.

We conclude that $g$ must be orthogonal to all the eigenvectors of $C_{\phi_n}$ with
nonzero eigenvalues. Thus, by the spectral theorem, $g$ must actually be in the
eigenspace for $C_{\phi_n}$ with eigenvalue 0; that is, $\phi_n * g = 0$ for all $n$.
Letting $n$ tend to infinity, we conclude that $g$ is the zero function. □
We may now prove (a generalization of) Theorem 12.18, without assuming ahead
of time that K is a matrix group.


Corollary D.9. If f is a square-integrable class function on K and f is orthogonal 
to the character of every finite-dimensional, irreducible representation of K, then f 
is zero almost everywhere. 


Proof. By Theorem D.8, we can find a sequence $g_n$ converging in $L^2(K)$ to $f$,
where each $g_n$ is a linear combination of matrix entries. Since $f$ is a class
function, the $L^2$ distance between $f(x)$ and $g_n(y x y^{-1})$ is independent of
$y$. Thus, if we define $f_n$ by

\[ f_n(x) = \int_K g_n(y x y^{-1}) \, d\mu(y), \]

the sequence $f_n$ will also converge to $f$ in $L^2(K)$. But by Lemma 12.20, each
$f_n$ is a linear combination of characters of irreducible representations. Thus, $f$
must be orthogonal to each $f_n$, and we conclude that

\[ \|f\|^2 = \langle f, f \rangle = \lim_{n \to \infty} \langle f, f_n \rangle = 0, \]

from which the claimed result follows. □
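For a finite group, completeness of characters can be verified by hand, which may help make the statement of Corollary D.9 concrete. The sketch below takes $K = S_3$ with normalized counting measure (an illustrative assumption, not the general compact Lie group of this appendix) and uses the three irreducible characters of $S_3$, which are standard: the trivial character, the sign character, and the character of the two-dimensional representation, equal to the number of fixed points minus one. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))  # the six elements of S_3

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed(p):
    """Number of fixed points of p."""
    return sum(1 for i in range(3) if p[i] == i)

# Characters of the three irreducible representations of S_3.
characters = {
    "trivial": {p: 1 for p in G},
    "sign": {p: sign(p) for p in G},
    "standard": {p: fixed(p) - 1 for p in G},
}

def inner(f, g):
    """<f, g> for normalized counting measure (all values here are real)."""
    return sum(Fraction(f[p] * g[p]) for p in G) / len(G)

def expand(f):
    """Coefficients of a class function f with respect to the characters."""
    return {name: inner(chi, f) for name, chi in characters.items()}

def reconstruct(coeffs):
    """Rebuild a function from its character coefficients."""
    return {p: sum(c * characters[name][p] for name, c in coeffs.items())
            for p in G}

# Example class function: the number of fixed points.
fixed_points = {p: fixed(p) for p in G}
coeffs = expand(fixed_points)
```

In this example `coeffs` has coefficient 1 on the trivial and standard characters and 0 on the sign character, and `reconstruct(coeffs)` reproduces `fixed_points`. Since $S_3$ has three conjugacy classes and three orthonormal irreducible characters, a class function orthogonal to all of them must vanish, which is the finite-group form of the corollary.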